Description

FIELD OF TECHNOLOGY 

The present invention relates to a novel DNA and a process for preparing a protein which possesses an activity to
inhibit osteoclast differentiation and/or maturation (hereinafter called osteoclastogenesis-inhibitory activity) by a genetic
engineering technique using the DNA. More particularly, the present invention relates to a genomic DNA encoding a
protein OCIF which possesses an osteoclastogenesis-inhibitory activity and a process for preparing said protein by a
genetic engineering technique using the genomic DNA.

BACKGROUND OF THE INVENTION 

Human bones constantly undergo a process of resorption and formation. Osteoblasts, which control bone formation,
and osteoclasts, which control bone resorption, play major roles in this process. Osteoporosis is a typical disease
caused by abnormal bone metabolism. It arises when bone resorption by osteoclasts exceeds bone
formation by osteoblasts. Although the mechanism of this disease has yet to be elucidated completely, the disease
causes the bones to ache, makes the bones fragile, and may result in fractures. As the aged population
increases, this disease leads to an increase in bedridden elderly people, which has become a social problem. Urgent
development of a therapeutic agent for this disease is strongly desired. Diseases due to a decrease in bone mass are
expected to be treated by controlling bone resorption, accelerating bone formation, or improving the balance between bone
resorption and formation.

Osteogenesis is expected to increase by accelerating proliferation, differentiation, or activation of the cells controlling
bone formation, or by controlling proliferation, differentiation, or activation of the cells involved in bone resorption.
In recent years, strong interest has been directed to physiologically active proteins (cytokines) exhibiting such activities
as described above, and energetic research is ongoing on this subject. The cytokines which have been reported to
accelerate proliferation or differentiation of osteoblasts include the proteins of the fibroblast growth factor family (FGF:
Rodan S. B. et al., Endocrinology vol. 121, p 1917, 1987), insulin-like growth factor I (IGF-I: Hock J. M. et al.,
Endocrinology vol. 122, p 254, 1988), insulin-like growth factor II (IGF-II: McCarthy T. et al., Endocrinology vol. 124, p 301, 1989),
Activin A (Centrella M. et al., Mol. Cell. Biol., vol. 11, p 250, 1991), transforming growth factor-β (Noda M., The Bone,
vol. 2, p 29, 1988), Vasculotropin (Varonique M. et al., Biochem. Biophys. Res. Commun., vol. 199, p 380, 1994), and
the proteins of the heterotopic bone formation factor family (bone morphogenic protein; BMP: BMP-2; Yamaguchi A. et al., J.
Cell Biol. vol. 113, p 682, 1991, OP-1; Sampath T. K. et al., J. Biol. Chem. vol. 267, p 20532, 1992, and Knutsen R. et
al., Biochem. Biophys. Res. Commun. vol. 194, p 1352, 1993).

On the other hand, the cytokines reported to suppress differentiation and/or maturation of osteoclasts include transforming
growth factor-β (Chenu C. et al., Proc. Natl. Acad. Sci. USA, vol. 85, p 5683, 1988), interleukin-4 (Kasano K. et al.,
Bone-Miner., vol. 21, p 179, 1993), and the like. Further, the cytokines reported to suppress bone
resorption by osteoclasts include calcitonin (Bone-Miner., vol. 17, p 347, 1992), macrophage colony stimulating factor
(Hattersley G. et al., J. Cell. Physiol., vol. 137, p 199, 1988), interleukin-4 (Watanabe K. et al., Biochem. Biophys. Res.
Commun. vol. 172, p 1035, 1990), and interferon-γ (Gowen M. et al., J. Bone Miner. Res., vol. 1, p 469, 1986).

These cytokines are expected to be used as agents for treating diseases accompanying bone loss by accelerating
bone formation or suppressing bone resorption. Clinical tests are being undertaken to verify the effect of some
cytokines, such as insulin-like growth factor-I and the heterotopic bone formation factor family, in improving bone metabolism.
In addition, calcitonin is already commercially available as a therapeutic agent for osteoporosis and a pain relief agent.

At present, drugs for clinically treating bone diseases or shortening the period of treatment of bone diseases include
activated vitamin D3, calcitonin and its derivatives, and hormone preparations such as estradiol agent, ipriflavone, or
calcium preparations. These agents are not necessarily satisfactory in terms of efficacy and therapeutic results.
Development of a novel therapeutic agent which can be used in place of these agents is strongly desired.

In view of this situation, the present inventors have undertaken extensive studies. As a result, the present inventors
found protein OCIF exhibiting an osteoclastogenesis-inhibitory activity in a culture broth of human embryonic lung
fibroblast IMR-90 (ATCC Deposition No. CCL186), and filed a patent application (PCT/JP96/00374). The present
inventors have conducted further studies relating to the origin of this protein OCIF exhibiting the
osteoclastogenesis-inhibitory activity. The studies have matured into determination of the sequence of a genomic DNA
encoding OCIF of human origin. Accordingly, an object of the present invention is to provide a genomic DNA encoding
protein OCIF exhibiting osteoclastogenesis-inhibitory activity and a process for preparing this protein by a genetic
engineering technique using the genomic DNA.

DISCLOSURE OF THE INVENTION

Specifically, the present invention relates to a genomic DNA encoding protein OCIF exhibiting
osteoclastogenesis-inhibitory activity and a process for preparing this protein by a genetic engineering technique using the genomic DNA.
The DNA of the present invention includes the nucleotide sequences No. 1 and No. 2 in the Sequence Table attached
hereto.

Moreover, the present invention relates to a process for preparing a protein, comprising inserting a DNA including
the nucleotide sequences of the sequences No. 1 and No. 2 in the Sequence Table into an expression vector, producing
a vector capable of expressing a protein having the following physicochemical characteristics and exhibiting the activity
of inhibiting differentiation and/or maturation of osteoclasts, and producing this protein by a genetic engineering
technique:

(a) molecular weight (SDS-PAGE):

(i) Under reducing conditions: about 60 kD,

(ii) Under non-reducing conditions: about 60 kD and about 120 kD;

(b) amino acid sequence:

includes an amino acid sequence of the Sequence ID No. 3 of the Sequence Table,

(c) affinity:

exhibits affinity to a cation exchanger and heparin, and

(d) thermal stability:

(i) the osteoclast differentiation and/or maturation inhibitory activity is reduced when treated with heat at 70°C
for 10 minutes or at 56°C for 30 minutes,

(ii) the osteoclast differentiation and/or maturation inhibitory activity is lost when treated with heat at 90°C for
10 minutes.

The protein obtained by expressing the gene of the present invention exhibits an osteoclastogenesis-inhibitory
activity. This protein is effective as an agent for the treatment and improvement of diseases involving a decrease in the
amount of bone such as osteoporosis, and diseases relating to bone metabolism abnormality such as rheumatism,
degenerative joint disease, or multiple myeloma, and is useful as an antigen to establish an immunological diagnosis of such
diseases.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the result of Western Blotting analysis of the protein obtained by expressing the genomic DNA of the present
invention in Example 4 (iii), wherein lane 1 indicates a marker, lane 2 indicates the culture broth of
COS7 cells into which the vector pWESRαOCIF (Example 4 (iii)) has been transfected, and lane 3 indicates the culture broth of
COS7 cells into which the vector pWESRα (control) has been transfected.

BEST MODE FOR CARRYING OUT THE INVENTION 

The genomic DNA encoding the protein OCIF which exhibits osteoclastogenesis-inhibitory activity in the present
invention can be obtained by preparing a cosmid library using a human placenta genomic DNA and a cosmid vector
and by screening this library using DNA fragments which are prepared based on the OCIF cDNA as a probe. The
thus-obtained genomic DNA is inserted into a suitable expression vector to prepare an OCIF expression cosmid. A
recombinant type OCIF can be obtained by transfecting the genomic DNA into a host organism such as various types of cells
or microorganism strains and causing the DNA to express a protein by a conventional method. The resultant protein
exhibiting osteoclastogenesis-inhibitory activity (an osteoclastogenesis-inhibitory factor) is useful as an agent for the
treatment and improvement of diseases involving a decrease in bone mass such as osteoporosis and other diseases
relating to bone metabolism abnormality and also as an antigen to prepare antibodies for establishing immunological
diagnosis of such diseases. The protein of the present invention can be prepared as a drug composition for oral or
non-oral administration. Specifically, the drug composition of the present invention containing the protein which is an
osteoclastogenesis-inhibitory factor as an active ingredient can be safely administered to humans and animals. As the form
of drug composition, a composition for injection, composition for intravenous drip, suppository, nasal agent, sublingual
agent, percutaneous absorption agent, and the like are given. In the case of the composition for injection, such a
composition is a mixture of a pharmacologically effective amount of the osteoclastogenesis-inhibitory factor of the present
invention and a pharmaceutically acceptable carrier. The composition may further comprise amino acids, saccharides,
cellulose derivatives, and other excipients and/or activation agents, including other organic compounds and inorganic
compounds which are commonly added to a composition for injection. When an injection preparation is prepared using
the osteoclastogenesis-inhibitory factor of the present invention and these excipients and activation agents, a pH
adjuster, buffering agent, stabilizer, solubilizing agent, and the like may be added if necessary to prepare various types
of injection agents.

The present invention will now be described in more detail by way of examples which are given for the purpose of 
illustration and not intended to be limiting of the present invention. 

Example 1

(Preparation of a cosmid library) 

A cosmid library was prepared using human placenta genomic DNA (Clontech; Cat. No. 6550-2) and pWE15
cosmid vector (Stratagene). The experiment was carried out following principally the protocol attached to the pWE15
cosmid vector kit of Stratagene Company, provided that Molecular Cloning: A Laboratory Manual (Cold Spring Harbor
Laboratory (1989)) was referred to for common procedures for handling DNA, E. coli, and phage.

Human placenta genomic DNA dissolved in 750 µl of a solution containing 10 mM Tris-HCl, 10 mM MgCl2, and 100
mM NaCl was added to four 1.5 ml Eppendorf tubes (tubes A, B, C, and D) in the amount of 100 µg each. Restriction
enzyme MboI was added to these tubes in the amounts of 0.2 unit for tube A, 0.4 unit for tube B, 0.6 unit for tube C, and
0.8 unit for tube D, and the DNA was digested for 1 hour. Then, EDTA in the amount to make a 20 mM concentration was
added to each tube to terminate the reaction, followed by extraction with phenol/chloroform (1:1). A two-fold amount of
ethanol was added to the aqueous layer to precipitate DNA. The DNA was collected by centrifugation and washed with 70%
ethanol, and the DNA in each tube was dissolved in 100 µl of TE (10 mM Tris-HCl (pH 8.0) + 1 mM EDTA buffer solution,
hereinafter called TE). The DNA in the four tubes was combined in one tube and incubated for 10 minutes at 68°C. After cooling to room
temperature, the mixture was overlaid onto a 10%-40% linear sucrose gradient which was prepared in a buffer
containing 20 mM Tris-HCl (pH 8.0), 5 mM EDTA, and 1 mM NaCl in a centrifuge tube (38 ml). The tube was centrifuged
at 26,000 rpm for 24 hours at 20°C using a rotor SRP28SA manufactured by Hitachi, Ltd., and 0.4 ml fractions of the
sucrose gradient were collected using a fraction collector. A portion of each fraction was subjected to 0.4% agarose
electrophoresis to confirm the size of the DNA. Fractions containing DNA with a length of 30 kb (kilo base pairs) to 40 kb were
thus combined. The DNA solution was diluted with TE to bring the sucrose concentration to 10% or less, and 2.5 volumes
of ethanol were added to precipitate DNA. The DNA was dissolved in TE and stored at 4°C.

(ii) Preparation of cosmid vector 

The pWE15 cosmid vector obtained from Stratagene Company was completely digested with restriction enzyme
BamHI according to the protocol attached to the cosmid vector kit. DNA collected by ethanol precipitation was dissolved
in TE to a concentration of 1 mg/ml. The phosphate group at the 5'-end of this DNA was removed using calf small intestine
alkaline phosphatase, and the DNA was collected by phenol extraction and ethanol precipitation. The DNA was dissolved
in TE to a concentration of 1 mg/ml.

(iii) Ligation of genomic DNA to vector and in vitro packaging

1.5 µg of genomic DNA fractionated according to size and 3 µg of pWE15 cosmid vector which was
digested with restriction enzyme BamHI were ligated in 20 µl of a reaction solution using Ready-To-Go T4 DNA ligase
of Pharmacia Company. The ligated DNA was packaged in vitro using Gigapack™ II packaging extract (Stratagene)
according to the protocol. After the packaging reaction, a portion of the reaction mixture was diluted stepwise with an
SM buffer solution and mixed with E. coli XL1-Blue MR (Stratagene), which was suspended in 10 mM MgCl2, to allow
the phage to infect the cells, and the cells were plated onto LB agar plates containing 50 µg/ml of ampicillin. The number of
colonies produced was counted. The number of colonies per 1 µl of packaging reaction was calculated based on this result.
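
The titer calculation mentioned above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name, the dilution factor, the plated volume, and the colony count are all hypothetical, since the text gives none of these values.

```python
def colonies_per_microliter(colony_count, dilution_factor, plated_volume_ul):
    """Colonies formed per µl of undiluted packaging reaction.

    colony_count     -- colonies counted on one plate
    dilution_factor  -- total fold dilution in SM buffer (e.g. 100 for 1:100)
    plated_volume_ul -- volume of diluted reaction mixed with the cells, in µl
    """
    if dilution_factor <= 0 or plated_volume_ul <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    return colony_count * dilution_factor / plated_volume_ul


# Hypothetical numbers for illustration only (not data from this example).
titer = colonies_per_microliter(colony_count=150, dilution_factor=100,
                                plated_volume_ul=10)
print(f"{titer:.0f} colonies per µl of packaging reaction")
```

The same function applies to the later count of colonies per ml of the stored library suspension if the plated volume is expressed in ml instead of µl.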

(iv) Preparation of a cosmid library

The packaging reaction solution thus prepared was mixed with E. coli XL1-Blue MR and the mixture was plated
onto agarose plates containing ampicillin so as to produce 50,000 colonies per agarose plate having a diameter of 15 cm.
After incubating the plates overnight at 37°C, an LB culture medium was added in the amount of 3 ml per plate to
suspend and collect colonies of E. coli. Each agarose plate was again washed with 3 ml of the LB culture medium and
the washing was combined with the original suspension of E. coli. The E. coli collected from all agarose plates was
placed in a centrifuge tube, glycerol was added to a concentration of 20%, and ampicillin was further added to make a
final concentration of 50 µg/ml. A portion of the E. coli suspension was removed and the remainder was stored at
-80°C. The removed E. coli was diluted stepwise and plated onto agar plates to count the number of colonies per 1
ml of suspension.
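
The text does not state how many plates were prepared. As a rough guide to the library size involved, the standard Clarke-Carbon estimate N = ln(1 - P) / ln(1 - f) gives the number of clones N needed for a given sequence to be represented with probability P, where f is the insert size divided by the genome size. The sketch below assumes a 35 kb average insert (the midpoint of the 30 kb to 40 kb fractions selected above) and a haploid human genome of about 3 x 10^9 bp; both figures and the function name are assumptions made for illustration, not values taken from this example.

```python
import math


def clones_needed(insert_bp, genome_bp, probability):
    """Clarke-Carbon estimate of the clones needed to cover a locus."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    fraction = insert_bp / genome_bp          # f: genome fraction per clone
    # log1p(-x) computes ln(1 - x) accurately for small x.
    return math.ceil(math.log1p(-probability) / math.log1p(-fraction))


# Assumed values (not from the source): 35 kb inserts, 3e9 bp genome, P = 0.99.
n_clones = clones_needed(insert_bp=35_000, genome_bp=3_000_000_000,
                         probability=0.99)
n_plates = math.ceil(n_clones / 50_000)       # 50,000 colonies per plate
print(n_clones, "clones, i.e. about", n_plates, "plates")
```

Under these assumptions the estimate comes to a little under 400,000 clones, or about eight plates at the stated plating density.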

Example 2 


(Screening of cosmid library and purification of colony) 

A nitrocellulose filter (Millipore) with a diameter of 14.2 cm was placed on each LB agarose plate with a diameter
of 15 cm which contained 50 µg/ml of ampicillin. The cosmid library was plated onto the plates so as to produce 50,000
colonies of E. coli per plate, followed by incubation overnight at 37°C. E. coli on the nitrocellulose filter was transferred
to another nitrocellulose filter according to a conventional method to obtain two replica filters. According to the protocol
attached to the cosmid vector kit, cosmid DNA in the E. coli on the replica filters was denatured with an alkali,
neutralized, and immobilized on the nitrocellulose filter using a Stratalinker (Stratagene). The filters were heated for two hours
at 80°C in a vacuum oven. The nitrocellulose filters thus obtained were hybridized using two kinds of DNA produced,
respectively, from the 5'-end and 3'-end of human OCIF cDNA as probes. Namely, a plasmid was purified from E. coli
pKB/OIF10 (deposited at The Ministry of International Trade and Industry, the Agency of Industrial Science and
Technology, Biotechnology Laboratory, Deposition No. FERM BP-5267) containing OCIF cDNA. The plasmid containing
OCIF cDNA was digested with restriction enzymes KpnI and EcoRI. The fragments thus obtained were separated using
agarose gel electrophoresis. A KpnI/EcoRI fragment with a length of 0.2 kb was purified using a QIAEX II gel extraction
kit (Qiagen). This DNA was labeled with 32P using the Megaprime DNA Labeling System (Amersham) (5'-DNA probe).
Apart from this, a BamHI/EcoRV fragment with a length of 0.2 kb which was produced from the above plasmid by
digestion with restriction enzymes BamHI and EcoRV was purified and labeled with 32P (3'-DNA probe). One of the replica
filters described above was hybridized with the 5'-DNA probe and the other with the 3'-DNA probe. Hybridization and
washing of the filters were carried out according to the protocol attached to the cosmid vector kit. Autoradiography
detected several positive signals with each probe. One colony which gave positive signals with both probes was
identified. The colony on the agar plate which corresponded to the signal on the autoradiogram was isolated and purified.
A cosmid was prepared from the purified colony by a conventional method. This cosmid was named pWEOCIF. The
size of human genomic DNA contained in this cosmid was about 38 kb.

Example 3

(Determination of the nucleotide sequence of human OCIF genomic DNA)

(i) Subcloning of OCIF genomic DNA

Cosmid pWEOCIF was digested with restriction enzyme EcoRI. After the separation of the DNA fragments thus
produced by electrophoresis using a 0.7% agarose gel, the DNA fragments were transferred to a nylon membrane
(Hybond-N, Amersham) by the Southern blot technique and immobilized on the nylon membrane using a Stratalinker
(Stratagene). On the other hand, plasmid pBKOCIF was digested with restriction enzyme EcoRI and a 1.6 kb fragment
containing human OCIF cDNA was isolated by agarose gel electrophoresis. The fragment was labeled with 32P using
the Megaprime DNA labeling system (Amersham).

Hybridization of the nylon membrane described above with the 32P-labeled 1.6-kb OCIF cDNA, performed
according to a conventional method, detected DNA fragments with sizes of 6 kb, 4 kb, 3.6 kb, and 2.6 kb. These
fragments, which hybridized with the human OCIF cDNA, were isolated using agarose gel electrophoresis and individually
subcloned into an EcoRI site of pBluescript II SK+ vector (Stratagene) by a conventional method. The resulting plasmids
were respectively named pBSE 6, pBSE 4, pBSE 3.6, and pBSE 2.6.

(ii) Determination of the nucleotide sequence 

The nucleotide sequence of human OCIF genomic DNA which was subcloned into the plasmids was determined
using the ABI Dideoxy Terminator Cycle Sequencing Ready Reaction kit (Perkin Elmer) and the 373 Sequencing
System (Applied Biosystems). The primers used for the determination of the nucleotide sequence were synthesized based
on the nucleotide sequence of human OCIF cDNA (Sequence ID No. 4 in the Sequence Table). The nucleotide
sequences thus determined are given as the Sequences No. 1 and No. 2 in the Sequence Table. The Sequence ID No.
1 includes the first exon of the OCIF gene and the Sequence ID No. 2 includes the second, third, fourth, and fifth exons.
A stretch of about 17 kb is present between the first and second exons.
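
The exon arrangement described above can be checked against the Sequence Table with a short script. The following is a minimal sketch and not part of the original procedure: it joins the coding part of the first exon (Sequence No. 1) to the beginning of the second exon (Sequence No. 2), translates the result with the standard genetic code, and checks that the intron following the first exon begins with GT and that the intron preceding the second exon ends with AG. The nucleotide strings are transcribed from the Sequence Table so as to agree with its amino-acid annotation, and the split after 21 residues follows the residue numbering printed there (position 1 at Glu); all function and variable names are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate coding DNA (length divisible by 3) to one-letter amino acids."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Transcribed from the Sequence Table (see the assumptions stated above).
EXON1_CODING = "ATGAACAAGTTGCTGTGCTGCGCGCTCGTG"   # Sequence No. 1, from the ATG
INTRON1_START = "GTAAGTCCCT"                      # follows exon 1 in Sequence No. 1
INTRON1_END = "TCCTTCTAG"                         # precedes exon 2 in Sequence No. 2
EXON2_START = "TTTCTGGACATCTCCATTAAGTGGACCACCCAGGAAACGTTT"

peptide = translate(EXON1_CODING + EXON2_START)
signal_peptide, mature_start = peptide[:21], peptide[21:]

print("residues -21 to -1:", signal_peptide)
print("residues 1 onward :", mature_start)
print("GT...AG intron boundaries:",
      INTRON1_START.startswith("GT") and INTRON1_END.endswith("AG"))
```

Joining the two exons in frame reproduces the residues annotated in the Sequence Table, which is consistent with the first intron interrupting the coding sequence between two complete codons.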

Example 4

(Production of recombinant OCIF using COS-7 cells)

(i) Preparation of OCIF genomic DNA expression cosmid

To express OCIF genomic DNA in animal cells, an expression unit of expression plasmid pcDL-SRα296 (Molecular
and Cellular Biology, vol. 8, p 466-472, 1988) was inserted into cosmid vector pWE15 (Stratagene). First of all, the
expression plasmid pcDL-SRα296 was digested with restriction enzyme SalI to cut out an expression unit with a length of about
1.7 kb which includes an SRα promoter, SV40 late splice signal, poly (A) addition signal, and so on. The digestion
products were separated by agarose electrophoresis and the 1.7-kb fragment was purified using the QIAEX II gel extraction
kit (Qiagen). On the other hand, cosmid vector pWE15 was digested with restriction enzyme EcoRI and the fragments
were separated using agarose gel electrophoresis. pWE15 DNA 8.2 kb long was purified using the QIAEX II gel
extraction kit (Qiagen). The ends of these two DNA fragments were blunted using a DNA blunting kit (Takara Shuzo),
ligated using a DNA ligation kit (Takara Shuzo), and transferred into E. coli DH5α (Gibco BRL). The resultant
transformant was grown and the expression cosmid pWESRα containing an expression unit was purified using a Qiagen column
(Qiagen).

The cosmid pWEOCIF containing the OCIF genomic DNA with a length of about 38 kb obtained in (i) above was
digested with restriction enzyme NotI to cut out the OCIF genomic DNA of about 38 kb. After separation by agarose
gel electrophoresis, the DNA was purified using the QIAEX II gel extraction kit (Qiagen). On the other hand, the
expression cosmid pWESRα was digested with restriction enzyme EcoRI and the digestion product was extracted with
phenol and chloroform, ethanol-precipitated, and dissolved in TE.

pWESRα digested with restriction enzyme EcoRI and an EcoRI-XmnI-NotI adapter (#1105, #1156; New England
Biolabs) were ligated using T4 DNA ligase (Takara Shuzo Co., Ltd.). After removal of the free adapter by
agarose gel electrophoresis, the product was purified using the QIAEX gel extraction kit (Qiagen). The OCIF genomic DNA with
a length of about 37 kb which was derived from the digestion with restriction enzyme NotI and the pWESRα to which
the adapter was attached were ligated using T4 DNA ligase (Takara Shuzo). The DNA was packaged in vitro using the
Gigapack packaging extract (Stratagene) and used to infect E. coli XL1-Blue MR (Stratagene). The resultant
transformant was grown and the expression cosmid pWESRαOCIF, into which the OCIF genomic DNA was inserted, was
purified using a Qiagen column (Qiagen). The OCIF expression cosmid pWESRαOCIF was ethanol-precipitated,
dissolved in sterile distilled water, and used in the following analysis.

(ii) Transient expression of OCIF genomic DNA and measurement of OCIF activity 

A recombinant OCIF was expressed as described below using the OCIF expression cosmid pWESRαOCIF
obtained in (i) above and its activity was measured. COS-7 cells (8×10⁵ cells/well) (Riken Cell Bank, RCB0539) were
seeded in a 6-well plate using DMEM culture medium (Gibco BRL) containing 10% fetal bovine serum (Gibco BRL). On
the following day, the culture medium was removed and the cells were washed with serum-free DMEM culture medium. The
OCIF expression cosmid pWESRαOCIF which had been diluted with OPTI-MEM culture medium (Gibco BRL) was
mixed with Lipofectamine and the mixture was added to the cells in each well according to the attached protocol. The
expression cosmid pWESRα was added to the cells in the same manner as a control. The amounts of the cosmid DNA
and Lipofectamine were 3 µg and 12 µl, respectively. After 24 hours, the culture medium was removed and 1.5 ml of
fresh EX-CELL 301 culture medium (JRH Bioscience) was added to each well. The culture medium was recovered after
48 hours and used as a sample for the measurement of OCIF activity. The measurement of OCIF activity was carried
out according to the method described by Kumegawa, M. et al. (Protein, Nucleic Acid, and Enzyme, vol. 34, p 999
(1989)) and the method of Takahashi, N. et al. (Endocrinology vol. 122, p 1373 (1988)). The osteoclast formation in
the presence of activated vitamin D3 from bone marrow cells isolated from mice aged about 17 days was evaluated by
the induction of tartaric acid resistant acid phosphatase activity. The inhibition of the acid phosphatase was measured
and used as the activity of the protein which possesses osteoclastogenesis-inhibitory activity (OCIF). Namely, 100
µl/well of an OCIF sample which was diluted with α-MEM culture medium (Gibco BRL) containing 2×10⁻⁸ M activated
vitamin D3 and 10% fetal bovine serum was added to each well of a 96-well micro plate. Then, 3×10⁵ bone marrow cells
isolated from mice (about 17 days old) suspended in 100 µl of α-MEM culture medium containing 10% fetal bovine
serum were added to each well of the 96-well micro plate and cultured for a week at 37°C and 100% humidity under a 5%
CO2 atmosphere. On days 3 and 5, 160 µl of the conditioned medium was removed from each well, and 160 µl of a
sample which was diluted with α-MEM culture medium containing 1×10⁻⁸ M activated vitamin D3 and 10% fetal bovine
serum was added. After 7 days from the start of culturing, the cells were washed with phosphate buffered saline and
fixed with an ethanol/acetone (1:1) solution for one minute at room temperature. The osteoclast formation was detected
by staining the cells using an acid phosphatase activity measurement kit (Acid Phosphatase, Leucocyte, Cat. No.
387-A, Sigma Company). A decrease in the number of cells positive for acid phosphatase activity in the presence of tartaric
acid was taken as the OCIF activity. The results are shown in Table 1, which indicates that the conditioned medium
exhibits activity similar to that of natural type OCIF obtained from the IMR-90 culture medium and recombinant OCIF
produced by CHO cells.
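
The vitamin D3 concentrations quoted above (2×10⁻⁸ M in the initial 100 µl sample, 1×10⁻⁸ M in the 160 µl replacement medium) are consistent with a constant working concentration in the wells, which can be verified with simple mixing arithmetic. The sketch below assumes ideal mixing and that the bone marrow cell suspension itself contains no activated vitamin D3; the text does not state the latter explicitly, and the function name is illustrative.

```python
import math


def mixed_concentration(vol_a_ul, conc_a_molar, vol_b_ul, conc_b_molar):
    """Concentration after ideally mixing two volumes of known concentration."""
    total_ul = vol_a_ul + vol_b_ul
    return (vol_a_ul * conc_a_molar + vol_b_ul * conc_b_molar) / total_ul


# Day 0: 100 µl sample at 2e-8 M plus 100 µl cell suspension (assumed 0 M).
day0 = mixed_concentration(100, 2e-8, 100, 0.0)

# Days 3 and 5: 160 µl of the 200 µl is replaced by medium at 1e-8 M.
after_change = mixed_concentration(200 - 160, day0, 160, 1e-8)

print(f"day 0: {day0:.1e} M, after medium change: {after_change:.1e} M")
assert math.isclose(day0, 1e-8) and math.isclose(after_change, 1e-8)
```

Under these assumptions the wells stay at about 1×10⁻⁸ M activated vitamin D3 throughout the culture period.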



TABLE 1

Activity of OCIF expressed by COS-7 cells in the conditioned medium

Dilution: 1/10, 1/20, 1/40, 1/80, 1/160, 1/320
OCIF genomic DNA introduced: ++
Vector introduced:
Untreated:

"++" indicates an activity inhibiting 80% or more of osteoclast formation, "+" indicates an activity inhibiting 30-80%
of osteoclast formation, and "-" indicates that no inhibition of osteoclast formation is observed.
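
The scoring legend of Table 1 can be expressed as a small helper. This is only a sketch: the legend does not say how inhibition between 0% and 30% is scored, so assigning "-" to that range is an assumption, as is the function name.

```python
def score_inhibition(percent_inhibition):
    """Map percent inhibition of osteoclast formation to a Table 1 symbol."""
    if percent_inhibition >= 80:
        return "++"      # 80% or more inhibition
    if percent_inhibition >= 30:
        return "+"       # 30-80% inhibition
    return "-"           # assumption: anything below 30% is scored as "-"


# Hypothetical inhibition values, for illustration only.
for value in (95, 50, 0):
    print(value, score_inhibition(value))
```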



(iii) Identification of the product by Western Blotting 

A buffer solution (10 µl) for SDS-PAGE (0.5 M Tris-HCl, 20% glycerol, 4% SDS, 20 µg/ml bromophenol blue, pH
6.8) was added to 10 µl of the sample for the measurement of OCIF activity prepared in (ii) above. After boiling for 3
minutes at 100°C, the mixture was subjected to 10% SDS polyacrylamide electrophoresis under non-reducing
conditions. The proteins were transferred from the gel to a PVDF membrane (ProBlott, Perkin Elmer) using a semi-dry blotting
apparatus (Biorad). The membrane was blocked and incubated for 2 hours at 37°C together with a horseradish
peroxidase-labeled anti-OCIF antibody obtained by labeling the previously obtained OCIF protein with horseradish
peroxidase according to a conventional method. After washing, the protein which had bound the anti-OCIF antibody was
detected using the ECL system (Amersham). As shown in Figure 1, two bands, one with a molecular weight of about
120 kilodaltons and the other about 60 kilodaltons, were detected in the supernatant obtained from the culture broth of
COS-7 cells into which pWESRαOCIF was transfected. On the other hand, these two bands with molecular weights of about
120 kilodaltons and 60 kilodaltons were not detected in the supernatant obtained from the culture broth of COS-7 cells
into which the pWESRα vector was transfected, confirming that the protein obtained was OCIF.

INDUSTRIAL APPLICABILITY 

The present invention provides a genomic DNA encoding a protein OCIF which possesses an
osteoclastogenesis-inhibitory activity and a process for preparing this protein by a genetic engineering technique using the genomic DNA.
The protein obtained by expressing the gene of the present invention exhibits an osteoclastogenesis-inhibitory activity
and is useful as an agent for the treatment and improvement of diseases involving a decrease in the amount of bone
such as osteoporosis, other diseases resulting from bone metabolism abnormality such as rheumatism or degenerative
joint disease, and multiple myeloma. The protein is further useful as an antigen to establish antibodies useful for an
immunological diagnosis of such diseases.

NOTE ON MICROORGANISM 

Depositing Organization:

The Ministry of International Trade and Industry, National Institute of Bioscience and
Human Technology, Agency of Industrial Science and Technology
Address: 1-3, Higashi 1-Chome, Tsukuba-shi, Ibaraki-ken, Japan

Date of Deposition: June 21, 1995 (originally deposited on June 21, 1995 and transferred to the international
deposition according to the Budapest Treaty on October 25, 1995)

Accession No. FERM BP-5267

TABLE OF SEQUENCES

Sequence number: 1 
Length of sequence: 1316 
Sequence Type: nucleic acid 
Strandedness: double 
Topology: linear 

Molecular type: genomic DNA (human OCIF genomic DNA-1) 



Sequence:

CTGGAGACAT ATAACTTGAA CACTTGGCCC TGATGGGGAA GCAGCTCTCC AGGGACTTTT 60
TCACCCATCT GTAAACAATT TCAGTGGCAA CCCGCGAACT GTAATCCATG AATGGGACCA 120
[illegible] GTCATCAAGT CTAACTTCTA GACCAGGGAA TTAATCGGGG AGACAGCGAA 180
CCCTACACCA AAGTGCCAAA CTTCTGTCGA TAGCTTGAGG CTAGTCGAAA CACCTCGACG 240
AGCCTACTCC AGAAGTTCAG CGCGTAGGAA GCTCCGATAC CAATAGCCCT TTGATGATGC 300
TCCGGTTGGT GAAGGGAACA GTGCTCCGCA AGGTTATCCC TGCCCCAGCC AGTCCAATTT 360
TCACTCTGCA GATTCTCTCT GGCTCTAACT ACCCCAGATA ACAAGGAGTG AATCCAGAAT 420
AGCACCGGCT TTAGGGCCAA TCAGACATTA CTTAGAAAAA TTCCTACTAC ATGGTTTATG 480
TAAACTTGAA GATGAATGAT TCCGAACTCC CCGAAAAGGG CTCACACAAT GCCATGCATA 540
AAGAGGCGCC CTGTAATTTG AGGTTTCAGA ACCCGAAGTG AAGGGGTCAG GCAGCCGGGT 600
ACGGCGGAAA CTCACAGCTT TCGCCCAGCG AGAGGACAAA GGTCTGGGAC ACACTCCAAC 660
TGCGTCCGGA TCTTGGCTGG ATCGGACTCT CAGGGTGGAG GAGACACAAG CACAGCAGCT 720
GCCCAGCGTG TGCCCAGCCC TCCCACCGCT GGTCCCGGCT GCCAGGAGGC TGGCCGCTGG 780
CGGGAAGGGG CCGGGAAACC TCAGAGCCCC GCGGAGACAG CAGCCGCCTT GTTCCTCAGC 840
CCGGTGGCTT TTTTTTCCCC TGCTCTCCCA GGGGACAGAC ACCACCGCCC CACCCCTCAC 900
GCCCCACCTC CCTGGGGGAT CCTTTCCGCC CCAGCCCTGA AAGCGTTAAT CCTGGAGCTT 960
TCTGCACACC CCCCGACCGC TCCCGCCCAA GCTTCCTAAA AAAGAAAGGT GCAAAGTTTG 1020
GTCCAGGATA GAAAAATGAC TGATCAAAGG CAGGCGATAC TTCCTGTTGC CGGGACGCTA 1080
TATATAACGT GATGAGCGCA CGGGCTGCGG AGACGCACCG GAGCGCTCGC CCACCCGCCC 1140
CCTCCAAGCC CCTGAGGTTT CCGGCGACCA CA ATG AAC AAG TTG CTG TGC TGC 1193
                                    Met Asn Lys Leu Leu Cys Cys
                                        -20                 -15

GCG CTC GTG GTAAGTCCCT GGGCCAGCCG ACGGGTGCCC GGCGCCTGGG 1242
Ala Leu Val

GAGGCTGCTG CCACCTGGTC TCCCAACCTC CCAGCGGACC GGCGGGGAGA ACGCTCCACT 1302
CGCTCCCTCC CAGG 1316

Sequence number: 2 
Length of sequence: 9898 
Sequence Type: nucleic acid 
Strandedness: double 
Topology: linear 

Molecular type: genomic DNA (human OCIF genomic DNA-2) 
Sequence : 

GCTTACTTTG TGCCAAATCT CATTAGGCTT AAGGTAATAC AGGACTTTGA GTCAAATGAT 60 
ACTGTTGCAC ATAAGAACAA ACCTATTTTC ATGCTAAGAT GATGCCACTG TGTTCCTTTC 120 
TCCTTCTAG TTT CTG GAC ATC TCC ATT AAG TGG ACC ACC CAG GAA ACG TTT 171 
Phe Leu Asp Ile Ser Ile Lys Trp Thr Thr Gln Glu Thr Phe
-10 -5 1

CCT CCA AAG TAC CTT CAT TAT GAC GAA GAA ACC TCT CAT CAG CTG TTG 219
Pro Pro Lys Tyr Leu His Tyr Asp Glu Glu Thr Ser His Gln Leu Leu
5 10 15

TGT GAC AAA TGT CCT CCT GGT ACC TAC CTA AAA CAA CAC TGT ACA GCA 267
Cys Asp Lys Cys Pro Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala
20 25 30 35

AAG TGG AAG ACC GTG TGC GCC CCT TGC CCT GAC CAC TAC TAC ACA GAC 315
Lys Trp Lys Thr Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp
40 45 50

AGC TGG CAC ACC AGT GAC GAG TGT CTA TAC TGC AGC CCC GTG TGC AAG 363
Ser Trp His Thr Ser Asp Glu Cys Leu Tyr Cys Ser Pro Val Cys Lys
55 60 65

GAG CTG CAG TAC GTC AAG CAG GAG TGC AAT CGC ACC CAC AAC CGC GTG 411
Glu Leu Gln Tyr Val Lys Gln Glu Cys Asn Arg Thr His Asn Arg Val
70 75 80

TGC GAA TGC AAG GAA GGG CGC TAC CTT GAG ATA GAG TTC TGC TTG AAA 459
Cys Glu Cys Lys Glu Gly Arg Tyr Leu Glu Ile Glu Phe Cys Leu Lys
85 90 95

CAT AGG AGC TGC CCT CCT GGA TTT GGA GTG GTG CAA GCT G GTACGTGTCA 509
His Arg Ser Cys Pro Pro Gly Phe Gly Val Val Gln Ala
100 105 110

ATGTGCAGCA AAATTAATTA GGATCATGCA AAGTCAGATA GTTGTGACAG TTTAGGAGAA 569
CACTTTTCTT CTCATGACAT TATACCATAG CAAATTGCAA AGGTAATGAA ACCTGCCAGG 629
TAGGTACTAT GTGTCTGGAG TGCTTCCAAA GGACCATTGC TCAGAGGAAT ACTTTCCCAC 689 
TACAGGGCAA TTTAATGACA AATCTCAAAT GCAGCAAATT ATTCTCTCAT GAGATGCATG 749 
ATGGTTTTTT TTTTTTTTTT TAAAGAAACA AACTCAAGTT CCACTATTGA TAGTTGATCT 809
ATACCTCTAT ATTTCACTTC AGCATCGACA CCTTCAAACT GCAGCACTTT TTGACAAACA 869 
TCAGAAATGT TAATTTATAC CAAGAGAGTA ATTATGCTCA TATTAATGAG ACTCTGGAGT 929
GCTAACAATA AGCAGTTATA ATTAATTATG TAAAAAATGA GAATGGTGAG GGGAATTGCA 989 
TTTCATTATT AAAAACAAGG CTAGTTCTTC CTTTAGCATG GGAGCTGAGT GTTTGGGAGG 1049
GTAAGGACTA TAGCAGAATC TCTTCAATGA GCTTATTCTT TATCTTAGAC AAAACAGATT 1109 
GTCAAGCCAA GAGCAAGCAC TTGCCTATAA ACCAAGTGCT TTCTCTTTTG CATTTTGAAC 1169
AGCATTGGTC AGGGCTCATG TGTATTGAAT CTTTTAAACC AGTAACCCAC GTTTTTTTTC 1229 
TGCCACATTT GCGAAGCTTC AGTGCAGCCT ATAACTTTTC ATAGCTTGAG AAAATTAAGA 1289 
GTATCCACTT ACTTAGATGG AAGAAGTAAT CAGTATAGAT TCTGATGACT CAGTTTGAAG 1349 
CAGTGTTTCT CAACTGAAGC CCTGCTGATA TTTTAAGAAA TATCTGGATT CCTAGGCTGG 1409 
ACTCCTTTTT GTGGGCAGCT GTCCTGCGCA TTGTAGAATT TTCGCAGCAC CCCTGGACTC 1469
TAGCCACTAG ATACCAATAG CAGTCCTTCC CCCATGTGAC AGCCAAAAAT GTCTTCAGAC 1529 
ACTGTCAAAT GTCGCCAGGT GGCAAAATCA CTCCTGGTTG AGAACAGGGT CATCAATGCT 1589 
AAGTATCTGT AACTATTTTA ACTCTCAAAA CTTGTGATAT ACAAAGTCTA AATTATTAGA 1649 
CGACCAATAC TTTAGGTTTA AAGGCATACA AATGAAACAT TCAAAAATCA AAATCTATTC 1709 
TGTTTCTCAA ATAGTGAATC TTATAAAATT AATCACAGAA GATGCAAATT GCATCAGAGT 1769 
CCCTTAAAAT TCCTCTTCGT ATGAGTATTT GAGGGAGGAA TTGGTGATAG TTCCTACTTT 1829 
CTATTGGATG GTACTTTGAG ACTCAAAAGC TAAGCTAAGT TGTGTGTGTG TCAGGGTGCG 1889 
GGGTGTGGAA TCCCATCAGA TAAAAGCAAA TCCATGTAAT TCATTCAGTA AGTTGTATAT 1949 
GTAGAAAAAT GAAAAGTGGG CTATGCAGCT TGGAAACTAG AGAATTTTGA AAAATAATGG 2009 
AAATCACAAG GATCTTTCTT AAATAAGTAA GAAAATCTGT TTGTAGAATG AAGCAAGCAG 2069
GCAGCCAGAA GACTCAGAAC AAAAGTACAC ATTTTACTCT GTGTACACTG GCAGCACAGT 2129
GGCATTTATT TACCTCTCCC TCCCTAAAAA CCCACACACC GGTTCCTCTT GGGAAATAAG 2189
AGGTTTCCAG CCCAAAGACA AGGAAAGACT ATGTGGTGTT ACTCTAAAAA GTATTTAATA 2249 
ACCGTTTTGT TGTTGCTGTT GCTGTTTTGA AATCAGATTG TCTCCTCTCC ATATTTTATT 2309 
TACTTCATTC TGTTAATTCC TGTGGAATTA CTTAGAGCAA GCATGGTGAA TTCTCAACTG 2369 
TAAAGCCAAA TTTCTCCATC ATTATAATTT CACATTTTGC CTGGCAGGTT ATAATTTTTA 2429 
TATTTCCACT GATAGTAATA AGGTAAAATC ATTACTTAGA TGGATAGATC TTTTTCATAA 2489 
AAAGTACCAT CAGTTATAGA GGGAAGTCAT GTTCATGTTC AGGAAGGTCA TTAGATAAAG 2549 
CTTCTGAATA TATTATGAAA CATTAGTTCT GTCATTCTTA GATTCTTTTT CTTAAATAAC 2609 
TTTAAAAGCT AACTTACCTA AAAGAAATAT CTGACACATA TGAACTTCTC ATTAGGATGC 2669 
AGGAGAACAC CCAAGCCACA GATATGTATC TCAAGAATGA ACAAGATTCT TAGGCCCGGC 2729 
ACGGTGGCTC ACATCTGTAA TCTCAAGAGT TTGAGAGGTC AAGGCGGGCA GATCACCTGA 2789 
GGTCAGGAGT TCAAGACCAG CCTGGCCAAC ATGATGAAAC CCTGCCTCTA CTAAAAATAC 2849 
AAAAATTAGC AGGGCATGGT GGTGCATGCC TGCAACCCTA GCTACTCAGG AGGCTGAGAC 2909 
AGGAGAATCT CTTGAACCCT CGAGGCGGAG GTTGTGGTGA GCTGAGATCC CTCTACTGCA 2969 
CTCCAGCCTG GGTGACAGAG ATGAGACTCC GTCCCTGCCG CCGCCCCCGC CTTCCCCCCC 3029 
AAAAAGATTC TTCTTCATGC AGAACATACG GCAGTCAACA AAGGCAGACC TGGGTCCAGG 3089 
TGTCCAAGTC ACTTATTTCG AGTAAATTAG CAATGAAAGA ATGCCATGGA ATCCCTGCCC 3149 
AAATACCTCT GCTTATGATA TTGTAGAATT TGATATAGAG TTGTATCCCA TTTAAGGAGT 3209 
AGGATGTAGT AGGAAAGTAC TAAAAACAAA CACACAAACA GAAAACCCTC TTTGCTTTGT 3269
AAGGTGGTTC CTAAGATAAT GTCAGTGCAA TGCTGGAAAT AATATTTAAT ATGTGAAGGT 3329
TTTAGGCTGT GTTTTCCCCT CCTGTTCTTT TTTTCTGCCA GCCCTTTGTC ATTTTTGCAG 3389 
GTCAATGAAT CATGTAGAAA GAGACAGGAG ATGAAACTAG AACCAGTCCA TTTTGCCCCT 3449 
TTTTTTATTT TCTGGTTTTG GTAAAAGATA CAATGAGGTA GGAGGTTGAG ATTTATAAAT 3509 
GAAGTTTAAT AAGTTTCTGT AGCTTTGATT TTTCTCTTTC ATATTTGTTA TCTTGCATAA 3569 
GCCAGAATTG GCCTGTAAAA TCTACATATG GATATTGAAG TCTAAATCTG TTCAACTAGC 3629 
TTACACTAGA TGGAGATATT TTCATATTCA GATACACTGG AATGTATGAT CTAGCCATGC 3689
GTAATATAGT CAAGTCTTTG AAGGTATTTA TTTTTAATAG CGTCTTTAGT TGTGGACTGG 3749
TTCAAGTTTT TCTGCCAATG ATTTCTTCAA ATTTATCAAA TATTTTTCCA TCATGAAGTA 3809
AAATGCCCTT GCAGTCACCC TTCCTGAAGT TTGAACGACT CTGCTGTTTT AAACAGTTTA 3869
AGCAAATGGT ATATCATCTT CCGTTTACTA TGTAGCTTAA CTGCAGGCTT ACGCTTTTGA 3929 
GTCAGCGGCC AACTTTATTG CCACCTTCAA AAGTTTATTA TAATGTTGTA AATTTTTACT 3989
TCTCAAGGTT AGCATACTTA GGAGTTGCTT CACAATTAGG ATTCAGGAAA GAAAGAACTT 4049 
CAGTAGGAAC TGATTGGAAT TTAATGATGC AGCATTCAAT GGGTACTAAT TTCAAAGAAT 4109 
GATATTACAG CAGACACACA GCAGTTATCT TGATTTTCTA GGAATAATTG TATGAAGAAT 4169 
ATGGCTGACA ACACGCCCTT ACTGCCACTC AGCGGAGGCT GGACTAATGA ACACCCTACC 4229 
CTTCTTTCCT TTCCTCTCAC ATTTCATGAG CGTTTTGTAG GTAACGAGAA AATTGACTTG 4289 
CATTTGCATT ACAAGGAGGA CAAACTGGCA AAGGGGATGA TGGTGGAAGT TTTGTTCTGT 4349 
CTAATGAAGT GAAAAATGAA AATGCTAGAG TTTTGTGCAA CATAATAGTA GCAGTAAAAA 4409 
CCAAGTGAAA AGTCTTTCCA AAACTGTGTT AAGAGGGCAT CTGCTGGGAA ACGATTTGAG 4469 
GAGAAGGTAC TAAATTGCTT GGTATTTTCC GTAG GA ACC CCA GAG CGA AAT ACA 4523 
Gly Thr Pro Glu Arg Asn Thr
115

GTT TGC AAA AGA TGT CCA GAT GGG TTC TTC TCA AAT GAG ACG TCA TCT 4571
Val Cys Lys Arg Cys Pro Asp Gly Phe Phe Ser Asn Glu Thr Ser Ser
120 125 130 135

AAA GCA CCC TGT AGA AAA CAC ACA AAT TGC AGT GTC TTT GGT CTC CTG 4619
Lys Ala Pro Cys Arg Lys His Thr Asn Cys Ser Val Phe Gly Leu Leu
140 145 150

CTA ACT CAG AAA GGA AAT GCA ACA CAC GAC AAC ATA TGT TCC GGA AAC 4667
Leu Thr Gln Lys Gly Asn Ala Thr His Asp Asn Ile Cys Ser Gly Asn
155 160 165

AGT GAA TCA ACT CAA AAA TGT GGA ATA G GTAATTACAT TCCAAAATAC 4715
Ser Glu Ser Thr Gln Lys Cys Gly Ile
170 175

GTCTTTGTAC GATTTTGTAG TATCATCTCT CTCTCTGAGT TGAACACAAG GCCTCCACCC 4775 
ACATTCTTGC TCAAACTTAC ATTTTCCCTT TCTTGAATCT TAACCAGCTA AGGCTACTCT 4835 
CGATGCATTA CTGCTAAAGC TACCACTCAC AATCTCTCAA AAACTCATCT TCTCACAGAT 4895
AACACCTCAA AGCTTGATTT TCTCTCCTTT CACACTGAAA TCAAATCTTC CCCATAGGCA 4955 
AAGGGCAGTG TCAAGTTTGC CACTGAGATG AAATTAGGAG AGTCCAAACT GTAGAATTCA 5015 
CGTTGTGTGT TATTACTTTC ACGAATGTCT GTATTATTAA CTAAAGTATA TATTGGCAAC 5075 
TAAGAAGCAA AGTGATATAA ACATGATGAC AAATTAGGCC AGGCATGGTG GCTTACTCCT 5135
ATAATCCCAA CATTTTGGGG GGCCAAGGTA GGCAGATCAC TTGAGGTCAG GATTTCAAGA 5195
CCAGCCTGAC CAACATGGTG AAACCTTGTC TCTACTAAAA ATACAAAAAT TAGCTGGGCA 5255
TGGTAGCAGG CACTTCTAGT ACCAGCTACT CAGGGCTGAG GCAGGAGAAT CGCTTGAACC 5315 
CAGGAGATGG AGGTTGCAGT GAGCTGAGAT TGTACCACTG CACTCCAGTC TGGGCAACAG 5375 
AGCAAGATTT CATCACACAC ACACACACAC ACACACACAC ACACATTAGA AATGTGTACT 5435 
TGGCTTTGTT ACCTATGCTA TTAGTGCATC TATTGCATGG AACTTCCAAG CTACTCTCCT 5495 
TGTGTTAAGC TCTTCATTGG GTACAGGTCA CTAGTATTAA GTTCAGGTTA TTCGGATGCA 5555 
TTCCACGGTA GTGATGACAA TTCATCAGGC TAGTGTGTGT GTTCACCTTG TCACTCCCAC 5615 
CACTAGACTA ATCTCAGACC TTCACTCAAA GACACATTAC ACTAAAGATG ATTTGCTTTT 5675 
TTGTGTTTAA TCAAGCAATG GTATAAACCA GCTTGACTCT CCCCAAACAG TTTTTCGTAC 5735 
TACAAAGAAG TTTATGAAGC AGAGAAATGT GAATTGATAT ATATATGAGA TTCTAACCCA 5795 
GTTCCAGCAT TGTTTCATTG TGTAATTGAA ATCATAGACA AGCCATTTTA GCCTTTGCTT 5855
TCTTATCTAA AAAAAAAAAA AAAAAAATGA ACGAAGGGGT ATTAAAAGGA GTGATCAAAT 5915
TTTAACATTC TCTTTAATTA ATTCATTTTT AATTTTACTT TTTTTCATTT ATTGTGCACT 5975 
TACTATCTCG TACTGTGCTA TAGAGGCTTT AACATTTATA AAAACACTGT GAAAGTTGCT 6035 
TCAGATGAAT ATAGCTAGTA GAACGGCAGA ACTAGTATTC AAAGCCAGGT CTGATGAATC 6095 
CAAAAACAAA CACCCATTAC TCCCATTTTC TGGGACATAC TTACTCTACC CAGATGCTCT 6155
GGGCTTTGTA ATGCCTATGT AAATAACATA GTTTTATGTT TGGTTATTTT CCTATGTAAT 6215 
GTCTACTTAT ATATCTGTAT CTATCTCTTG CTTTGTTTCC AAAGGTAAAC TATGTGTCTA 6275 
AATGTGGGCA AAAAATAACA CACTATTCCA AATTACTGTT CAAATTCCTT TAAGTCAGTG 6335 
ATAATTATTT GTTTTGACAT TAATCATGAA GTTCCCTGTG CGTACTAGGT AAACCTTTAA 6395
TAGAATGTTA ATGTTTGTAT TCATTATAAG AATTTTTGGC TGTTACTTAT TTACAACAAT 6455 
ATTTCACTCT AATTAGACAT TTACTAAACT TTCTCTTGAA AACAATGCCC AAAAAAGAAC 6515 
ATTAGAAGAC ACGTAAGCTC AGTTGGTCTC TGCCACTAAG ACCAGCCAAC AGAAGCTTGA 6575 
TTTTATTCAA ACTTTGCATT TTAGCATATT TTATCTTGGA AAATTCAATT GTGTTGGTTT 6635 
TTTGTTTTTG TTTGTATTGA ATAGACTCTC AGAAATCCAA TTGTTGAGTA AATCTTCTGG 6695 
GTTTTCTAAC CTTTCTTTAG AT GTT ACC CTG TGT GAG GAG GCA TTC TTC AGG 6747 
Asp Val Thr Leu Cys Glu Glu Ala Phe Phe Arg 
180 185 

TTT GCT GTT CCT ACA AAG TTT ACG CCT AAC TGG CTT AGT GTC TTG GTA 6795
Phe Ala Val Pro Thr Lys Phe Thr Pro Asn Trp Leu Ser Val Leu Val
190 195 200

GAC AAT TTG CCT GGC ACC AAA GTA AAC GCA GAG AGT GTA GAG AGG ATA 6843
Asp Asn Leu Pro Gly Thr Lys Val Asn Ala Glu Ser Val Glu Arg Ile
205 210 215

AAA CGG CAA CAC AGC TCA CAA GAA CAG ACT TTC CAG CTG CTG AAG TTA 6891
Lys Arg Gln His Ser Ser Gln Glu Gln Thr Phe Gln Leu Leu Lys Leu
220 225 230 235

TGG AAA CAT CAA AAC AAA GAC CAA GAT ATA GTC AAG AAG ATC ATC CAA G 6940
Trp Lys His Gln Asn Lys Asp Gln Asp Ile Val Lys Lys Ile Ile Gln
240 245 250

GTATGATAAT CTAAAATAAA AAGATCAATC AGAAATCAAA GACACCTATT TATCATAAAC 7000 
CAGGAACAAG ACTGCATGTA TGTTTAGTTG TGTGGATCTT GTTTCCCTGT TGGAATCATT 7060 
GTTGGACTGA AAAAGTTTCC ACCTGATAAT GTAGATGTGA TTCCACAAAC AGTTATACAA 7120
GGTTTTGTTC TCACCCCTGC TCCCCAGTTT CCTTGTAAAG TATGTTGAAC ACTCTAAGAG 7180 
AAGAGAAATG CATTTGAAGG CAGGGCTGTA TCTCAGGGAG TCGCTTCCAG ATCCCTTAAC 7240 
GCTTCTGTAA GCAGCCCCTC TAGACCACCA AGGAGAAGCT CTATAACCAC TTTGTATCTT 7300 
ACATTGCACC TCTACCAAGA AGCTCTGTTG TATTTACTTG GTAATTCTCT CCACGTAGGC 7360 
TTTTCGTAGC TTACAAATAT GTTCTTATTA ATCCTCATGA TATGGCCTGC ATTAAAATTA 7420
TTTTAATGCC ATATGTTATG AGAATTAATG AGATAAAATC TGAAAAGTGT TTGAGCCTCT 7480 
TGTAGGAAAA AGCTAGTTAC AGCAAAATGT TCTCACATCT TATAAGTTTA TATAAAGATT 7540 
CTCCTTTAGA AATGGTGTCA GAGAGAAACA GAGAGAGATA GGGAGAGAAG TGTGAAAGAA 7600 
TCTGAAGAAA AGGAGTTTCA TCCAGTGTGG ACTGTAAGCT TTACGACACA TGATGGAAAG 7660 
AGTTCTGACT TCAGTAAGCA TTGGGACGAC ATGCTAGAAG AAAAAGGAAG AAGAGTTTCC 7720 
ATAATCCAGA CAGGGTCAGT GAGAAATTCA TTCAGGTCCT CACCAGTAGT TAAATGACTG 7780 
TATAGTCTTG CACTACCCTA AAAAACTTCA AGTATCTGAA ACCGGGGCAA CAGATTTTAG 7840 
GAGACCAACC TCTTTGAGAG CTGATTGCTT TTGCTTATCC AAAGAGTAAA CTTTTATGTT 7900 
TTGAGCAAAC CAAAAGTATT CTTTGAACGT ATAATTAGCC CTGAAGCCGA AAGAAAAGAG 7960 
AAAATCAGAG ACCGTTAGAA TTGGAAGCAA CCAAATTCCC TATTTTATAA ATGAGGACAT 8020
TTTAACCCAG AAAGATGAAC CGATTTGGCT TAGGGCTCAC AGATACTAAG TGACTCATGT 8080
CATTAATAGA AATGTTAGTT CCTCCCTCTT ACGTTTGTAC CCTAGCTTAT TACTGAAATA 8140 
TTCTCTAGGC TGTGTGTCTC CTTTAGTTCC TCGACCTCAT GTCTTTGAGT TTTCAGATAT 8200 
CCTCCTCATG GAGGTAGTCC TCTGGTGCTA TGTGTATTCT TTAAAGGCTA GTTACGGCAA 8260 
TTAACTTATC AACTAGCGCC TACTAATGAA ACTTTGTATT ACAAAGTAGC TAACTTGAAT 8320
ACTTTCCTTT TTTTCTGAAA TGTTATGGTG GTAATTTCTC AAACTTTTTC TTAGAAAACT 8380
GAGAGTGATG TGTCTTATTT TCTACTGTTA ATTTTCAAAA TTAGGAGCTT CTTCCAAAGT 8440 
TTTGTTGGAT GCCAAAAATA TATAGCATAT TATCTTATTA TAACAAAAAA TATTTATCTC 8500 
AGTTCTTAGA AATAAATGGT GTCACTTAAC TCCCTCTCAA AAGAAAAGGT TATCATTGAA 8560
ATATAATTAT GAAATTCTGC AAGAACCTTT TGCCTCACGC TTGTTTTATG ATGGCATTGG 8620
ATGAATATAA ATGATGTGAA CACTTATCTG GGCTTTTGCT TTATGCAG AT ATT GAC 8676
Asp Ile Asp

CTC TGT GAA AAC AGC GTG CAG CGG CAC ATT GGA CAT GCT AAC CTC ACC 8724
Leu Cys Glu Asn Ser Val Gln Arg His Ile Gly His Ala Asn Leu Thr
255 260 265 270

TTC GAG CAG CTT CGT AGC TTG ATG GAA AGC TTA CCG GGA AAG AAA GTG 8772
Phe Glu Gln Leu Arg Ser Leu Met Glu Ser Leu Pro Gly Lys Lys Val
275 280 285

GGA GCA GAA GAC ATT GAA AAA ACA ATA AAG GCA TGC AAA CCC AGT GAC 8820
Gly Ala Glu Asp Ile Glu Lys Thr Ile Lys Ala Cys Lys Pro Ser Asp
290 295 300

CAG ATC CTG AAG CTG CTC AGT TTG TGG CGA ATA AAA AAT GGC GAC CAA 8868
Gln Ile Leu Lys Leu Leu Ser Leu Trp Arg Ile Lys Asn Gly Asp Gln
305 310 315

GAC ACC TTG AAG GGC CTA ATG CAC GCA CTA AAG CAC TCA AAG ACG TAC 8916
Asp Thr Leu Lys Gly Leu Met His Ala Leu Lys His Ser Lys Thr Tyr
320 325 330

CAC TTT CCC AAA ACT GTC ACT CAG AGT CTA AAG AAG ACC ATC AGG TTC 8964
His Phe Pro Lys Thr Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe
335 340 345 350

CTT CAC AGC TTC ACA ATG TAC AAA TTG TAT CAG AAG TTA TTT TTA GAA 9012
Leu His Ser Phe Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu
355 360 365

ATG ATA GGT AAC CAG GTC CAA TCA GTA AAA ATA AGC TGC TTA 9054
Met Ile Gly Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu
370 375 380

TAACTGGAAA TGGCCATTCA GCTGTTTCCT CACAATTGGC GAGATCCCAT GGATCAGTAA 9114 

ACTGTTTCTC AGGCACTTGA GGCTTTCAGT CATATCTTTC TCATTACCAG TGACTAATTT 9174 

TGCCACAGGG TACTAAAACA AACTATGATG TGGAGAAAGG ACTAACATCT CCTCCAATAA 9234 

ACCCCAAATG GTTAATCCAA CTGTCAGATC TGGATCGTTA TCTACTGACT ATATTTTCCC 9294 

TTATTACTGC TTGCAGTAAT TCAACTGCAA ATTAAAAAAA AAAAACTACA CTCCACTCGG 9354 

CCTTACTAAA TATGGGAATG TCTAACTTAA ATAGCTTTGG GATTCCACCT ATGCTAGAGG 9414 

CTTTTATTAG AAAGCCATAT TTTTTTCTGT AAAAGTTACT AATATATCTG TAACACTATT 9474
ACAGTATTGC TATTTATATT CATTCAGATA TAAGATTTGG ACATATTATC ATCCTATAAA 9534
GAAACGGTAT GACTTAATTT TAGAAAGAAA ATTATATTCT GTTTATTATG ACAAATGAAA 9594 
GAGAAAATAT ATATTTTTAA TGGAAACTTT GTAGCATTTT TCTAATAGGT ACTGCCATAT 9654 
TTTTCTGTGT GGAGTATTTT TATAATTTTA TCTGTATAAG CTGTAATATC ATTTTATAGA 9714 
AAATGCATTA TTTAGTCAAT TGTTTAATGT TGGAAAACAT ATGAAATATA AATTATCTGA 9774 
ATATTAGATG CTCTGAGAAA TTGAATGTAC CTTATTTAAA AGATTTTATG GTTTTATAAC 9834
TATATAAATG ACATTATTAA AGTTTTCAAA TTATTTTTTA TTGCTTTCTC TGTTCCTTTT 9894
ATTT 9898 
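
The number printed at the right of each row above is the running total of bases up to the end of that row. A minimal Python sketch, assuming each row consists of whitespace-separated base groups followed by that counter, can flag rows whose base count disagrees with the printed total. The helper name `check_rows` is hypothetical and is not part of the listing.

```python
def check_rows(rows):
    """Return (row_index, counted_total, printed_total) for each inconsistent row."""
    mismatches = []
    running = 0
    for i, row in enumerate(rows):
        # Last field is the printed counter; the rest are base groups or codons.
        *groups, counter = row.split()
        running += sum(len(g) for g in groups)
        printed = int(counter)
        if running != printed:
            mismatches.append((i, running, printed))
            running = printed  # resynchronise so one bad row is reported only once
    return mismatches


# First three rows of Sequence number 2, copied from the listing.
rows = [
    "GCTTACTTTG TGCCAAATCT CATTAGGCTT AAGGTAATAC AGGACTTTGA GTCAAATGAT 60",
    "ACTGTTGCAC ATAAGAACAA ACCTATTTTC ATGCTAAGAT GATGCCACTG TGTTCCTTTC 120",
    "TCCTTCTAG TTT CTG GAC ATC TCC ATT AAG TGG ACC ACC CAG GAA ACG TTT 171",
]
print(check_rows(rows))
```

A row that has lost or gained characters shows up as a mismatch between the counted and printed totals, which narrows down where a transcription error sits.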

Sequence number: 3 
Length of sequence: 401 
Sequence Type: amino acid 
Strandedness: single stranded 
Topology: linear 
Molecular type: protein 

Sequence : 

Met Asn Asn Leu Leu Cys Cys Ala Leu Val Phe Leu Asp Ile Ser
-20 -15 -10

Ile Lys Trp Thr Thr Gln Glu Thr Phe Pro Pro Lys Tyr Leu His
-5 1 5

Tyr Asp Glu Glu Thr Ser His Gln Leu Leu Cys Asp Lys Cys Pro
10 15 20

Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala Lys Trp Lys Thr
25 30 35

Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp Ser Trp His
40 45 50

Ile Gln Asp Ile Asp Leu Cys Glu Asn Ser Val Gln Arg His Ile
250 255 260

Gly His Ala Asn Leu Thr Phe Glu Gln Leu Arg Ser Leu Met Glu
265 270 275

Ser Leu Pro Gly Lys Lys Val Gly Ala Glu Asp Ile Glu Lys Thr
280 285 290

Ile Lys Ala Cys Lys Pro Ser Asp Gln Ile Leu Lys Leu Leu Ser
295 300 305

Leu Trp Arg Ile Lys Asn Gly Asp Gln Asp Thr Leu Lys Gly Leu
310 315 320

Met His Ala Leu Lys His Ser Lys Thr Tyr His Phe Pro Lys Thr
325 330 335

Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe Leu His Ser Phe
340 345 350

Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu Met Ile Gly
355 360 365

Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu
370 375 380
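
The stated lengths of the protein (401 amino acids) and of the cDNA in Sequence number 4 (1206 bases) agree with each other: 401 codons plus one stop codon give 402 × 3 = 1206 bases. The following Python sketch illustrates the codon-to-residue mapping with the standard genetic code, using the final coding row of Sequence number 2 followed by the stop codon as input. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative assumptions, not part of the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate a coding sequence in frame, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return " ".join(residues)


# Last coding row of Sequence number 2, followed by the TAA stop codon.
last_row = "ATG ATA GGT AAC CAG GTC CAA TCA GTA AAA ATA AGC TGC TTA TAA"
print(translate(last_row))

# Length check: 401 residues plus one stop codon.
print((401 + 1) * 3)
```

Comparing such a translation against the amino acid line printed under each coding row is one way to spot single-base transcription errors.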

Sequence number: 4 
Length of sequence: 1206 
Sequence Type: nucleic acid 
Strandedness : single stranded 
Topology: linear 
Molecular type: cDNA
Sequence:
ATGAACAACT TGCTGTGCTG CGCGCTCGTG TTTCTGGACA TCTCCATTAA GTGGACCACC 60
CAGGAAACGT TTCCTCCAAA GTACCTTCAT TATGACGAAG AAACCTCTCA TCAGCTGTTG 120
TGTGACAAAT GTCCTCCTGG TACCTACCTA AAACAACACT GTACAGCAAA GTGGAAGACC 180
GTGTGCGCCC CTTGCCCTGA CCACTACTAC ACAGACAGCT GGCACACCAG TGACGAGTGT 240
CTATACTGCA GCCCCGTGTG CAAGGAGCTG CAGTACGTCA AGCAGGAGTG CAATCGCACC 300
CACAACCGCG TGTGCGAATG CAAGGAAGGG CGCTACCTTG AGATAGAGTT CTGCTTGAAA 360
CATAGGAGCT GCCCTCCTGG ATTTGGAGTG GTGCAAGCTG GAACCCCAGA GCGAAATACA 420
GTTTGCAAAA GATGTCCAGA TGGGTTCTTC TCAAATGAGA CGTCATCTAA AGCACCCTGT 480
AGAAAACACA CAAATTGCAG TGTCTTTGGT CTCCTGCTAA CTCAGAAAGG AAATGCAACA 540
CACGACAACA TATGTTCCGG AAACAGTGAA TCAACTCAAA AATGTGGAAT AGATGTTACC 600
CTGTGTGAGG AGGCATTCTT CAGGTTTGCT GTTCCTACAA AGTTTACGCC TAACTGGCTT 660
AGTGTCTTGG TAGACAATTT GCCTGGCACC AAAGTAAACG CAGAGAGTGT AGAGAGGATA 720
AAACGGCAAC ACAGCTCACA AGAACAGACT TTCCAGCTGC TGAAGTTATG GAAACATCAA 780
AACAAAGACC AAGATATAGT CAAGAAGATC ATCCAAGATA TTGACCTCTG TGAAAACAGC 840
GTGCAGCGGC ACATTGGACA TGCTAACCTC ACCTTCGAGC AGCTTCGTAG CTTGATGGAA 900
AGCTTACCGG GAAAGAAAGT GGGAGCAGAA GACATTGAAA AAACAATAAA GGCATGCAAA 960
CCCAGTGACC AGATCCTGAA GCTGCTCAGT TTGTGGCGAA TAAAAAATGG CGACCAAGAC 1020
ACCTTGAAGG GCCTAATGCA CGCACTAAAG CACTCAAAGA CGTACCACTT TCCCAAAACT 1080
GTCACTCAGA GTCTAAAGAA GACCATCAGG TTCCTTCACA GCTTCACAAT GTACAAATTG 1140
TATCAGAAGT TATTTTTAGA AATGATAGGT AACCAGGTCC AATCAGTAAA AATAAGCTGC 1200
TTATAA 1206

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: SNOW BRAND MILK PRODUCTS CO., LTD 

(B) STREET: 1-1, NAEBOCHO 6-CHOME

(C) CITY: HIGASHI-KU, SAPPORO-SHI

(D) STATE: HOKKAIDO

(E) COUNTRY: JP

(F) POSTAL CODE (ZIP): NONE

(ii> UsInG T^E I dT TI0N: " 0VBL ^ PR0CESS F ° R PROTEIN 

(iii) NUMBER OF SEQUENCES:

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(v) CURRENT APPLICATION DATA:

APPLICATION NUMBER: EP 97935810.8

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: JP 235928/96

(B) FILING DATE: 19-AUG-1996

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1316 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA (human OCIF genomic DNA-1)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



CTGGAGACAT ATAACTTGAA CACTTGGCCC TGATGGGGAA GCAGCTCTGC AGGGACTTTT 60 
TCAGCCATCT GTAAACAATT TCAGTGGCAA CCCGCGAACT GTAATCCATG AATGGGACCA 120 
CACTTTACAA GTCATCAAGT CTAACTTCTA GACCAGGGAA TTAATGGGGG AGACAGCGAA 180 
CCCTAGAGCA AAGTGCCAAA CTTCTGTCGA TAGCTTGAGG CTAGTGGAAA GACCTCGAGG 240 
AGGCTACTCC AGAAGTTCAG CGCGTAGGAA GCTCCGATAC CAATAGCCCT TTGATGATGG 300 
TGGGGTTGGT GAAGGGAACA GTGCTCCGCA AGGTTATCCC TGCCCCAGGC AGTCCAATTT 360 
TCACTCTGCA GATTCTCTCT GGCTCTAACT ACCCCAGATA ACAAGGAGTG AATGCAGAAT 420 
AGCACGGGCT TTAGGGCCAA TCAGACATTA GTTAGAAAAA TTCCTACTAC ATGGTTTATG 480 
TAAACTTGAA GATGAATGAT TGCGAACTCC CCGAAAAGGG CTCAGACAAT GCCATGCATA 540 
AAGAGGGGCC CTGTAATTTG AGGTTTCAGA ACCCGAAGTG AAGGGGTCAG GCAGCCGGGT 600 
ACGGCGGAAA CTCACAGCTT TCGCCCAGCG AGAGGACAAA GGTCTGGGAC ACACTCCAAC 660 
TGCGTCCGGA TCTTGGCTGG ATCGGACTCT CAGGGTGGAG GAGACACAAG CACAGCAGCT 720 
GCCCAGCGTG TGCCCAGCCC TCCCACCGCT GGTCCCGGCT GCCAGGAGGC TGGCCGCTGG 780 
CGGGAAGGGG CCGGGAAACC TCAGAGCCCC GCGGAGACAG CAGCCGCCTT GTTCCTCAGC 840 
CCGGTGGCTT TTTTTTCCCC TGCTCTCCCA GGGGACAGAC ACCACCGCCC CACCCCTCAC 900 
GCCCCACCTC CCTGGGGGAT CCTTTCCGCC CCAGCCCTGA AAGCGTTAAT CCTGGAGCTT 960 
TCTGCACACC CCCCGACCGC TCCCGCCCAA GCTTCCTAAA AAAGAAAGGT GCAAAGTTTG 1020 
GTCCAGGATA GAAAAATGAC TGATCAAAGG CAGGCGATAC TTCCTGTTGC CGGGACGCTA 1080 
TATATAACGT GATGAGCGCA CGGGCTGCGG AGACGCACCG GAGCGCTCGC CCAGCCGCCG 1140 
CCTCCAAGCC CCTGAGGTTT CCGGGGACCA CA ATG AAC AAG TTG CTG TGC TGC 1193 
Met Asn Lys Leu Leu Cys Cys
-20 -15

GCG CTC GTG GTAAGTCCCT GGGCCAGCCG ACGGGTGCCC GGCGCCTGGG 1242 






Ala Leu Val



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9898 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA (human OCIF genomic DNA-2)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

GCTTACTTTG TGCCAAATCT CATTAGGCTT AAGGTAATAC AGGACTTTGA GTCAAATGAT 
ACTGTTGCAC ATAAGAACAA ACCTATTTTC ATGCTAAGAT GATGCCACTG TGTTCCTTTC 
TCCTTCTAG TTT CTG GAC ATC TCC ATT AAG TGG ACC ACC CAG GAA ACG TTT 
Phe Leu Asp Ile Ser Ile Lys Trp Thr Thr Gln Glu Thr Phe



Pro Pro Lys Tyr Leu

TGT GAC AAA TGT CCT CCT GGT ACC TAC CTA AAA CAA CAC TGT ACA GCA 
Cys Asp Lys Cys Pro Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala
20 25 30 35 

AAG TGG AAG ACC GTG TGC GCC CCT TGC CCT GAC CAC TAC TAC ACA GAC 
Lys Trp Lys Thr Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp 

AGC TGG CAC ACC AGT GAC GAG TGT CTA TAC TGC AGC CCC GTG TGC AAG 
Ser Trp His Thr Ser Asp Glu Cys Leu Tyr Cys Ser Pro Val Cys Lys
55 60 65 

GAG CTG CAG TAC GTC AAG CAG GAG TGC AAT CGC ACC CAC AAC CGC GTG 
Glu Leu Gln Tyr Val Lys Gln Glu Cys Asn Arg Thr His Asn Arg Val
70 75 80 

TGC GAA TGC AAG GAA GGG CGC TAC CTT GAG ATA GAG TTC TGC TTG AAA 
Cys Glu Cys Lys Glu Gly Arg Tyr Leu Glu Ile Glu Phe Cys Leu Lys



100 105 110

ATGTGCAGCA AAATTAATTA GGATCATGCA AAGTCAGATA GTTGTGACAG TTTAGGAGAA 569 
CACTTTTGTT CTGATGACAT TATAGGATAG CAAATTGCAA AGGTAATGAA ACCTGCCAGG 629 
TAGGTACTAT GTGTCTGGAG TGCTTCCAAA GGACCATTGC TCAGAGGAAT ACTTTGCCAC 689 
TACAGGGCAA TTTAATGACA AATCTCAAAT GCAGCAAATT ATTCTCTCAT GAGATGCATG 749 
ATGGTTTTTT TTTTTTTTTT TAAAGAAACA AACTCAAGTT GCACTATTGA TAGTTGATCT 809
ATACCTCTAT ATTTCACTTC AGCATGGACA CCTTCAAACT GCAGCACTTT TTGACAAACA 869 
TCAGAAATGT TAATTTATAC CAAGAGAGTA ATTATGCTCA TATTAATGAG ACTCTGGAGT 929 
GCTAACAATA AGCAGTTATA ATTAATTATG TAAAAAATGA GAATGGTGAG GGGAATTGCA 989 
TTTCATTATT AAAAACAAGG CTAGTTCTTC CTTTAGCATG GGAGCTGAGT GTTTGGGAGG 1049 
GTAAGGACTA TAGCAGAATC TCTTCAATGA GCTTATTCTT TATCTTAGAC AAAACAGATT 1109
GTCAAGCCAA GAGCAAGCAC TTGCCTATAA ACCAAGTGCT TTCTCTTTTG CATTTTGAAC 1169
AGCATTGGTC AGGGCTCATG TGTATTGAAT CTTTTAAACC AGTAACCCAC GTTTTTTTTC 1229 
TGCCACATTT GCGAAGCTTC AGTGCAGCCT ATAACTTTTC ATAGCTTGAG AAAATTAAGA 1289 
GTATCCACTT ACTTAGATGG AAGAAGTAAT CAGTATAGAT TCTGATGACT CAGTTTGAAG 1349
CAGTGTTTCT CAACTGAAGC CCTGCTGATA TTTTAAGAAA TATCTGGATT CCTAGGCTGG 1409
ACTCCTTTTT GTGGGCAGCT GTCCTGCGCA TTGTAGAATT TTGGCAGCAC CCCTGGACTC 1469 
TAGCCACTAG ATACCAATAG CAGTCCTTCC CCCATGTGAC AGCCAAAAAT GTCTTCAGAC 1529 
ACTGTCAAAT GTCGCCAGGT GGCAAAATCA CTCCTGGTTG AGAACAGGGT CATCAATGCT 1589 
AAGTATCTGT AACTATTTTA ACTCTCAAAA CTTGTGATAT ACAAAGTCTA AATTATTAGA 1649 
CGACCAATAC TTTAGGTTTA AAGGCATACA AATGAAACAT TCAAAAATCA AAATCTATTC 1709 
TGTTTCTCAA ATAGTGAATC TTATAAAATT AATCACAGAA GATGCAAATT GCATCAGAGT 1769
CCCTTAAAAT TCCTCTTCGT ATGAGTATTT GAGGGAGGAA TTGGTGATAG TTCCTACTTT 1829 
CTATTGGATG GTACTTTGAG ACTCAAAAGC TAAGCTAAGT TGTGTGTGTG TCAGGGTGCG 1889 
GGGTGTGGAA TCCCATCAGA TAAAAGCAAA TCCATGTAAT TCATTCAGTA AGTTGTATAT 1949 
GTAGAAAAAT GAAAAGTGGG CTATGCAGCT TGGAAACTAG AGAATTTTGA AAAATAATGG 2009 
AAATCACAAG GATCTTTCTT AAATAAGTAA GAAAATCTGT TTGTAGAATG AAGCAAGCAG 2069 
GCAGCCAGAA GACTCAGAAC AAAAGTACAC ATTTTACTCT GTGTACACTG GCAGCACAGT 2129 
GGGATTTATT TACCTCTCCC TCCCTAAAAA CCCACACAGC GGTTCCTCTT GGGAAATAAG 2189
AGGTTTCCAG CCCAAAGAGA AGGAAAGACT ATGTGGTGTT ACTCTAAAAA GTATTTAATA 2249 
ACCGTTTTGT TGTTGCTGTT GCTGTTTTGA AATCAGATTG TCTCCTCTCC ATATTTTATT 2309 
TACTTCATTC TGTTAATTCC TGTGGAATTA CTTAGAGCAA GCATGGTGAA TTCTCAACTG 2369 
TAAAGCCAAA TTTCTCCATC ATTATAATTT CACATTTTGC CTGGCAGGTT ATAATTTTTA 2429 
TATTTCCACT GATAGTAATA AGGTAAAATC ATTACTTAGA TGGATAGATC TTTTTCATAA 2489 
AAAGTACCAT CAGTTATAGA GGGAAGTCAT GTTCATGTTC AGGAAGGTCA TTAGATAAAG 2549 
CTTCTGAATA TATTATGAAA CATTAGTTCT GTCATTCTTA GATTCTTTTT GTTAAATAAC 2609 
TTTAAAAGCT AACTTACCTA AAAGAAATAT CTGACACATA TGAACTTCTC ATTAGGATGC 2669 
AGGAGAAGAC CCAAGCCACA GATATGTATC TGAAGAATGA ACAAGATTCT TAGGCCCGGC 2729 
ACGGTGGCTC ACATCTGTAA TCTCAAGAGT TTGAGAGGTC AAGGCGGGCA GATCACCTGA 2789 
GGTCAGGAGT TCAAGACCAG CCTGGCCAAC ATGATGAAAC CCTGCCTCTA CTAAAAATAC 2849 
AAAAATTAGC AGGGCATGCT GGTGCATGCC TGCAACCCTA GCTACTCAGG AGGCTGAGAC 2909 
AGGAGAATCT CTTGAACCCT CGAGGCGGAG GTTGTGGTGA GCTGAGATCC CTCTACTGCA 2969 
CTCCAGCCTG GGTGACAGAG ATGAGACTCC GTCCCTGCCG CCGCCCCCGC CTTCCCCCCC 3029 
AAAAAGATTC TTCTTCATGC AGAACATACG GCAGTCAACA AAGGGAGACC TGGGTCCAGG 3089 
TGTCCAAGTC ACTTATTTCG AGTAAATTAG CAATGAAAGA ATGCCATGGA ATCCCTGCCC 3149 
AAATACCTCT GCTTATGATA TTGTAGAATT TGATATAGAG TTGTATCCCA TTTAAGGACT 3209 
AGGATGTAGT AGGAAAGTAC TAAAAACAAA CACACAAACA GAAAACCCTC TTTGCTTTGT 3269 
AAGGTGGTTC CTAAGATAAT GTCAGTGCAA TGCTGGAAAT AATATTTAAT ATGTGAAGGT 3329 
TTTAGGCTGT GTTTTCCCCT CCTGTTCTTT TTTTCTGCCA GCCCTTTGTC ATTTTTGCAG 3389 
GTCAATGAAT CATGTAGAAA GAGACAGGAG ATGAAACTAG AACCAGTCCA TTTTGCCCCT 3449 
TTTTTTATTT TCTGGTTTTG GTAAAAGATA CAATGAGGTA GGAGGTTGAG ATTTATAAAT 3509 
GAAGTTTAAT AAGTTTCTGT AGCTTTGATT TTTCTCTTTC ATATTTGTTA TCTTGCATAA 3569 
GCCAGAATTG GCCTGTAAAA TCTACATATG GATATTGAAG TCTAAATCTG TTCAACTAGC 3629 
TTACACTAGA TGGAGATATT TTCATATTCA GATACACTGG AATGTATGAT CTAGCCATGC 3689 
GTAATATAGT CAAGTGTTTG AAGGTATTTA TTTTTAATAG CGTCTTTAGT TGTGGACTGG 3749 
TTCAAGTTTT TCTGCCAATG ATTTCTTCAA ATTTATCAAA TATTTTTCCA TCATGAAGTA 3809
AAATGCCCTT GCAGTCACCC TTCCTGAAGT TTGAACGACT CTGCTGTTTT AAACAGTTTA 3869 
AGCAAATGGT ATATCATCTT CCGTTTACTA TGTAGCTTAA CTGCAGGCTT ACGCTTTTGA 3929 
GTCAGCGGCC AACTTTATTG CCACCTTCAA AAGTTTATTA TAATGTTGTA AATTTTTACT 3989 
TCTCAAGGTT AGCATACTTA GGAGTTGCTT CACAATTAGG ATTCAGGAAA GAAAGAACTT 4049 
CAGTAGGAAC TGATTGGAAT TTAATGATGC AGCATTCAAT GGGTACTAAT TTCAAAGAAT 4109 
GATATTACAG CAGACACACA GCAGTTATCT TGATTTTCTA GGAATAATTG TATGAAGAAT 4169 
ATGGCTGACA ACACGGCCTT ACTGCCACTC AGCGGAGGCT GGACTAATGA ACACCCTACC 4229 
CTTCTTTCCT TTCCTCTCAC ATTTCATGAG CGTTTTGTAG GTAACGAGAA AATTGACTTG 4289 
CATTTGCATT ACAAGGAGGA GAAACTGGCA AAGGGGATGA TGGTGGAAGT TTTGTTCTGT 4349 
CTAATGAAGT GAAAAATGAA AATGCTAGAG TTTTGTGCAA CATAATAGTA GCAGTAAAAA 4409 
CCAAGTGAAA AGTCTTTCCA AAACTGTGTT AAGAGGGCAT CTGCTGGGAA ACGATTTGAG 4469 
GAGAAGGTAC TAAATTGCTT GGTATTTTCC GTAG GA ACC CCA GAG CGA AAT ACA 4523 
Gly Thr Pro Glu Arg Asn Thr 
115 

GTT TGC AAA AGA TGT CCA GAT GGG TTC TTC TCA AAT GAG ACG TCA TCT 4571 

Val Cys Lys Arg Cys Pro Asp Gly Phe Phe Ser Asn Glu Thr Ser Ser

CTA ACT CAG AAA GGA AAT GCA ACA CAC GAC AAC ATA TGT TCC GGA AAC 4667
Leu Thr Gln Lys Gly Asn Ala Thr His Asp Asn Ile Cys Ser Gly Asn



170 175 

GTCTTTGTAC GATTTTGTAG TATCATCTCT CTCTCTGAGT TGAACACAAG GCCTCCAGCC 4775 
ACATTCTTGG TCAAACTTAC ATTTTCCCTT TCTTGAATCT TAACCAGCTA AGGCTACTCT 4835 
CGATGCATTA CTGCTAAAGC TACCACTCAG AATCTCTCAA AAACTCATCT TCTCACAGAT 4895 
AACACCTCAA AGCTTGATTT TCTCTCCTTT CACACTGAAA TCAAATCTTG CCCATAGGCA 4955 
AAGGGCAGTG TCAAGTTTGC CACTGAGATG AAATTAGGAG AGTCCAAACT GTAGAATTCA 5015 
CGTTGTGTGT TATTACTTTC ACGAATGTCT GTATTATTAA CTAAAGTATA TATTGGCAAC 5075 
TAAGAAGCAA AGTGATATAA ACATGATGAC AAATTAGGCC AGGCATGGTG GCTTACTCCT 5135 
ATAATCCCAA CATTTTGGGG GGCCAAGGTA GGCAGATCAC TTGAGGTCAG GATTTCAAGA 5195 
CCAGCCTGAC CAACATGGTG AAACCTTGTC TCTACTAAAA ATACAAAAAT TAGCTGGGCA 5255 
TGGTAGCAGG CACTTCTAGT ACCAGCTACT CAGGGCTGAG GCAGGAGAAT CGCTTGAACC 5315 
CAGGAGATGG AGGTTGCAGT GAGCTGAGAT TGTACCACTG CACTCCAGTC TGGGCAACAG 5375 
AGCAAGATTT CATCACACAC ACACACACAC ACACACACAC ACACATTAGA AATGTGTACT 5435 
TGGCTTTGTT ACCTATGGTA TTAGTGCATC TATTGCATGG AACTTCCAAG CTACTCTGGT 5495 
TGTGTTAAGC TCTTCATTGG GTACAGGTCA CTAGTATTAA GTTCAGGTTA TTCGGATGCA 5555 
TTCCACGGTA GTGATGACAA TTCATCAGGC TAGTGTGTGT GTTCACCTTG TCACTCCCAC 5615 
CACTAGACTA ATCTCAGACC TTCACTCAAA GACACATTAC ACTAAAGATG ATTTGCTTTT 5675
TTGTGTTTAA TCAAGCAATG GTATAAACCA GCTTGACTCT CCCCAAACAG TTTTTCGTAC 5735 
TACAAAGAAG TTTATGAAGC AGAGAAATGT GAATTGATAT ATATATGAGA TTCTAACCCA 5795 
GTTCCAGCAT TGTTTCATTG TGTAATTGAA ATCATAGACA AGCCATTTTA GCCTTTGCTT 5855 
TCTTATCTAA AAAAAAAAAA AAAAAAATGA AGGAAGGGGT ATTAAAAGGA GTGATCAAAT 5915 
TTTAACATTC TCTTTAATTA ATTCATTTTT AATTTTACTT TTTTTCATTT ATTGTGCACT 5975 
TACTATGTGG TACTGTGCTA TAGAGGCTTT AACATTTATA AAAACACTGT GAAAGTTGCT 6035 
TCAGATGAAT ATAGGTAGTA GAACGGCAGA ACTAGTATTC AAAGCCAGGT CTGATGAATC 6095 
CAAAAACAAA CACCCATTAC TCCCATTTTC TGGGACATAC TTACTCTACC CAGATGCTCT 6155 
GGGCTTTGTA ATGCCTATGT AAATAACATA GTTTTATGTT TGGTTATTTT CCTATGTAAT 6215 
GTCTACTTAT ATATCTGTAT CTATCTCTTG CTTTGTTTCC AAAGGTAAAC TATGTGTCTA 6275 
AATGTGGGCA AAAAATAACA CACTATTCCA AATTACTGTT CAAATTCCTT TAAGTCAGTG 6335 
ATAATTATTT GTTTTGACAT TAATCATGAA GTTCCCTGTG GGTACTAGGT AAACCTTTAA 6395 
TAGAATGTTA ATGTTTGTAT TCATTATAAG AATTTTTGGC TGTTACTTAT TTACAACAAT 6455 
ATTTCACTCT AATTAGACAT TTACTAAACT TTCTCTTGAA AACAATGCCC AAAAAAGAAC 6515 
ATTAGAAGAC ACGTAAGCTC AGTTGGTCTC TGCCACTAAG ACCAGCCAAC AGAAGCTTGA 6575 
TTTTATTCAA ACTTTGCATT TTAGCATATT TTATCTTGGA AAATTCAATT GTGTTGGTTT 6635 
TTTGTTTTTG TTTGTATTGA ATAGACTCTC AGAAATCCAA TTGTTGAGTA AATCTTCTGG 6695 
GTTTTCTAAC CTTTCTTTAG AT GTT ACC CTG TGT GAG GAG GCA TTC TTC AGG 6747 
Asp Val Thr Leu Cys Glu Glu Ala Phe Phe Arg 
180 185 



GAC AAT TTG CCT GGC ACC AAA GTA AAC GCA GAG AGT GTA GAG AGG ATA 6843 
Asp Asn Leu Pro Gly Thr Lys Val Asn Ala Glu Ser Val Glu Arg Ile
205 210 215 

AAA CGG CAA CAC AGC TCA CAA GAA CAG ACT TTC CAG CTG CTG AAG TTA 6891 
Lys Arg Gln His Ser Ser Gln Glu Gln Thr Phe Gln Leu Leu Lys Leu
220 225 230 235 

TGG AAA CAT CAA AAC AAA GAC CAA GAT ATA GTC AAG AAG ATC ATC CAA G 6940 
Trp Lys His Gln Asn Lys Asp Gln Asp Ile Val Lys Lys Ile Ile Gln
240 245 250



GTATGATAAT CTAAAATAAA AAGATCAATC AGAAATCAAA GACACCTATT TATCATAAAC 7000
CAGGAACAAG ACTGCATGTA TGTTTAGTTG TGTGGATCTT GTTTCCCTGT TGGAATCATT 7060 
GTTGGACTGA AAAAGTTTCC ACCTGATAAT GTAGATGTGA TTCCACAAAC AGTTATACAA 7120 
GGTTTTGTTC TCACCCCTGC TCCCCAGTTT CCTTGTAAAG TATGTTGAAC ACTCTAAGAG 7180 
AAGAGAAATG CATTTGAAGG CAGGGCTGTA TCTCAGGGAG TCGCTTCCAG ATCCCTTAAC 7240
GCTTCTGTAA GCAGCCCCTC TAGACCACCA AGGAGAAGCT CTATAACCAC TTTGTATCTT 7300
ACATTGCACC TCTACCAAGA AGCTCTGTTG TATTTACTTG GTAATTCTCT CCAGGTAGGC 7360 
TTTTCGTAGC TTACAAATAT GTTCTTATTA ATCCTCATGA TATGGCCTGC ATTAAAATTA 7420 
TTTTAATGGC ATATGTTATG AGAATTAATG AGATAAAATC TGAAAAGTGT TTGAGCCTCT 7480 
TGTAGGAAAA AGCTAGTTAC AGCAAAATGT TCTCACATCT TATAAGTTTA TATAAAGATT 7540 
CTCCTTTAGA AATGGTGTGA GAGAGAAACA GAGAGAGATA GGGAGAGAAG TGTGAAAGAA 7600 
TCTGAAGAAA AGGAGTTTCA TCCAGTGTGG ACTGTAAGCT TTACGACACA TGATGGAAAG 7660 
AGTTCTGACT TCAGTAAGCA TTGGGAGGAC ATGCTAGAAG AAAAAGGAAG AAGAGTTTCC 7720 
ATAATGCAGA CAGGGTCAGT GAGAAATTCA TTCAGGTCCT CACCAGTAGT TAAATGACTG 7780
TATAGTCTTG CACTACCCTA AAAAACTTCA AGTATCTGAA ACCGGGGCAA CAGATTTTAG 7840 
GAGACCAACG TCTTTGAGAG CTGATTGCTT TTGCTTATGC AAAGAGTAAA CTTTTATGTT 7900
TTGAGCAAAC CAAAAGTATT CTTTGAACGT ATAATTAGCC CTGAAGCCGA AAGAAAAGAG 7960 
AAAATCAGAG ACCGTTAGAA TTGGAAGCAA CCAAATTCCC TATTTTATAA ATGAGGACAT 8020 
TTTAACCCAG AAAGATGAAC CGATTTGGCT TAGGGCTCAC AGATACTAAG TGACTCATGT 8080 
CATTAATAGA AATGTTAGTT CCTCCCTCTT AGGTTTGTAC CCTAGCTTAT TACTGAAATA 8140 
TTCTCTAGGC TGTGTGTCTC CTTTAGTTCC TCGACCTCAT GTCTTTGAGT TTTCAGATAT 8200 
CCTCCTCATG GAGGTAGTCC TCTGGTGCTA TGTGTATTCT TTAAAGGCTA GTTACGGCAA 8260 
TTAACTTATC AACTAGCGCC TACTAATGAA ACTTTGTATT ACAAAGTAGC TAACTTGAAT 8320 
ACTTTCCTTT TTTTCTGAAA TGTTATGGTG GTAATTTCTC AAACTTTTTC TTAGAAAACT 8380 
GAGAGTGATG TGTCTTATTT TCTACTGTTA ATTTTCAAAA TTAGGAGCTT CTTCCAAAGT 8440 
TTTGTTGGAT GCCAAAAATA TATAGCATAT TATCTTATTA TAACAAAAAA TATTTATCTC 8500 
AGTTCTTAGA AATAAATGGT GTCACTTAAC TCCCTCTCAA AAGAAAAGGT TATCATTGAA 8560 
ATATAATTAT GAAATTCTGC AAGAACCTTT TGCCTCACGC TTGTTTTATG ATGGCATTGG 8620 
ATGAATATAA ATGATGTGAA CACTTATCTG GGCTTTTGCT TTATGCAG AT ATT GAC 8676 

Asp Ile Asp

CTC TGT GAA AAC AGC GTG CAG CGG CAC ATT GGA CAT GCT AAC CTC ACC 8724 
Leu Cys Glu Asn Ser Val Gln Arg His Ile Gly His Ala Asn Leu Thr



GGA GCA GAA GAC ATT GAA AAA ACA ATA AAG GCA TGC AAA CCC AGT GAC 8820 

Gly Ala Glu Asp Ile Glu Lys Thr Ile Lys Ala Cys Lys Pro Ser Asp
290 295 300 

CAG ATC CTG AAG CTG CTC AGT TTG TGG CGA ATA AAA AAT GGC GAC CAA 8868 
Gin He Leu Lys Leu Leu Ser Leu Trp Arg He Lys Asn Gly Asp Gin 

305 310 315 

GAC ACC TTG AAG GGC CTA ATG CAC GCA CTA AAG CAC TCA AAG ACG TAC 8916 
asp Thr Leu Lys Gly Leu Met His Ala Leu Lys His Ser Lys Thr Tyr 

320 325 330 

CAC TTT CCC AAA ACT GTC ACT CAG AGT CTA AAG AAG ACC ATC AGG TTC 8964 
His Phe Pro Lys Thr Val Thr Gin Ser Leu Lys Lys Thr He Arg Phe 
335 340 345 350 

CTT CAC AGC TTC ACA ATG TAC AAA TTG TAT CAG AAG TTA TTT TTA GAA 9012 
Su His Ser Phe Thr Met Tyr Lys Leu Tyr Gin Lys Leu Phe Leu Glu 
355 360 365 

ATG ATA GGT AAC CAG GTC CAA TCA GTA AAA ATA AGC TGC TTA 9054 
Met He Gly Asn Gin val Gin Ser Val Lys He Ser Cys Leu 
370 375 . 380 

TAACTGGAAA TGGCCATTGA GCTGTTTCCT CACAATTGGC GAGATCCCAT GGATGAGTAA 9114 
Ictottt^c IgIcacttga GGCTTTCAGT GATATCTTTC TCATTACCAG TGACTAATTT 9174 
tSaIH TACTAAAAGA AACTATGATG TGGAGAAAGG ACTAACATCT CCTCCAATAA 9234 
IcCCCAAATG GTTAATCCAA CTGTCAGATC TGGATCGTTA TCTACTGACT AWTTTTCCC 9294 
TTATTACTGC TTGCAGTAAT TCAACTGGAA ATTAAAAAAA AAAAACTAGA CTCCACTGGG 9354 
CCTTACTAAA TATGGGAATG TCTAACTTAA ATAGCTTTGG GATTCCAGCT ATGCTAGAGG 9414 
CTTTTATTAG S»T TTTTTTCTGT AAAAGTTACT AATATATCTG TAACACTATT 9474 
ACAGTATTGC TATTTATATT CATTCAGATA TAAGATTTGG ACATATTATC ATCCTATAAA 9534 
GAAACGGTAT GACTTAATTT TAGAAAGAAA ATTATATTCT GTTTATTATG ACAAATGAAA 9594 
GAGAAAATAT ATATTTTTAA TGGAAAGTTT GTAGCATTTT TCTAATAGGT ACTGCCATAT 9654 
TTTTCTGTGT GGAGTATTTT TATAATTTTA TCTGTATAAG CTGTAATATC ATTTTATAGA 9714 
AAATGCATTA TTTAGTCAAT TGTTTAATGT TGGAAAACAT ATGAAATATA AATTATCTGA 9774 
ATATTAGATG CTCTGAGAAA TTGAATGTAC CTTATTTAAA AGATTTTATG GTTTTATAAC 9834 
TATATAAATG ACATTATTAA AGTTTTCAAA TTATTTTTTA TTGCTTTCTC TGTTGCTTTT 9894 
ATTT 9898 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 401 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

Met Asn Asn Leu Leu Cys Cys Ala Leu Val Phe Leu Asp Ile Ser 
-20 -15 -10 

Ile Lys Trp Thr Thr Gln Glu Thr Phe Pro Pro Lys Tyr Leu His 
-5 1 5 

Tyr Asp Glu Glu Thr Ser His Gln Leu Leu Cys Asp Lys Cys Pro 
10 15 20 

Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala Lys Trp Lys Thr 
25 30 35 

Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp Ser Trp His 
40 45 50 

Thr Ser Asp Glu Cys Leu Tyr Cys Ser Pro Val Cys Lys Glu Leu 
55 60 65 

Gln Tyr Val Lys Gln Glu Cys Asn Arg Thr His Asn Arg Val Cys 
70 75 80 

Glu Cys Lys Glu Gly Arg Tyr Leu Glu Ile Glu Phe Cys Leu Lys 
85 90 95 

His Arg Ser Cys Pro Pro Gly Phe Gly Val Val Gln Ala Gly Thr 
100 105 110 

Pro Glu Arg Asn Thr Val Cys Lys Arg Cys Pro Asp Gly Phe Phe 
115 120 125 

Ser Asn Glu Thr Ser Ser Lys Ala Pro Cys Arg Lys His Thr Asn 
130 135 140 

Cys Ser Val Phe Gly Leu Leu Leu Thr Gln Lys Gly Asn Ala Thr 
145 150 155 

His Asp Asn Ile Cys Ser Gly Asn Ser Glu Ser Thr Gln Lys Cys 
160 165 170 

Gly Ile Asp Val Thr Leu Cys Glu Glu Ala Phe Phe Arg Phe Ala 
175 180 185 

Val Pro Thr Lys Phe Thr Pro Asn Trp Leu Ser Val Leu Val Asp 
190 195 200 

Asn Leu Pro Gly Thr Lys Val Asn Ala Glu Ser Val Glu Arg Ile 
205 210 215 

Lys Arg Gln His Ser Ser Gln Glu Gln Thr Phe Gln Leu Leu Lys 
220 225 230 

Leu Trp Lys His Gln Asn Lys Asp Gln Asp Ile Val Lys Lys Ile 
235 240 245 

Ile Gln Asp Ile Asp Leu Cys Glu Asn Ser Val Gln Arg His Ile 
250 255 260 

Gly His Ala Asn Leu Thr Phe Glu Gln Leu Arg Ser Leu Met Glu 
265 270 275 

Ser Leu Pro Gly Lys Lys Val Gly Ala Glu Asp Ile Glu Lys Thr 
280 285 290 

Ile Lys Ala Cys Lys Pro Ser Asp Gln Ile Leu Lys Leu Leu Ser 
295 300 305 

Leu Trp Arg Ile Lys Asn Gly Asp Gln Asp Thr Leu Lys Gly Leu 
310 315 320 

Met His Ala Leu Lys His Ser Lys Thr Tyr His Phe Pro Lys Thr 
325 330 335 

Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe Leu His Ser Phe 
340 345 350 

Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu Met Ile Gly 
355 360 365 

Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu 
370 375 380 


(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1206 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

ATGAACAACT TGCTGTGCTG CGCGCTCGTG TTTCTGGACA TCTCCATTAA GTGGACCACC 60 
CAGGAAACGT TTCCTCCAAA GTACCTTCAT TATGACGAAG AAACCTCTCA TCAGCTGTTG 120 
TGTGACAAAT GTCCTCCTGG TACCTACCTA AAACAACACT GTACAGCAAA GTGGAAGACC 180 
GTGTGCGCCC CTTGCCCTGA CCACTACTAC ACAGACAGCT GGCACACCAG TGACGAGTGT 240 
CTATACTGCA GCCCCGTGTG CAAGGAGCTG CAGTACGTCA AGCAGGAGTG CAATCGCACC 300 
CACAACCGCG TGTGCGAATG CAAGGAAGGG CGCTACCTTG AGATAGAGTT CTGCTTGAAA 360 
CATAGGAGCT GCCCTCCTGG ATTTGGAGTG GTGCAAGCTG GAACCCCAGA GCGAAATACA 420 
GTTTGCAAAA GATGTCCAGA TGGGTTCTTC TCAAATGAGA CGTCATCTAA AGCACCCTGT 480 
AGAAAACACA CAAATTGCAG TGTCTTTGGT CTCCTGCTAA CTCAGAAAGG AAATGCAACA 540 
CACGACAACA TATGTTCCGG AAACAGTGAA TCAACTCAAA AATGTGGAAT AGATGTTACC 600 
CTGTGTGAGG AGGCATTCTT CAGGTTTGCT GTTCCTACAA AGTTTACGCC TAACTGGCTT 660 
AGTGTCTTGG TAGACAATTT GCCTGGCACC AAAGTAAACG CAGAGAGTGT AGAGAGGATA 720 
AAACGGCAAC ACAGCTCACA AGAACAGACT TTCCAGCTGC TGAAGTTATG GAAACATCAA 780 
AACAAAGACC AAGATATAGT CAAGAAGATC ATCCAAGATA TTGACCTCTG TGAAAACAGC 840 
GTGCAGCGGC ACATTGGACA TGCTAACCTC ACCTTCGAGC AGCTTCGTAG CTTGATGGAA 900 
AGCTTACCGG GAAAGAAAGT GGGAGCAGAA GACATTGAAA AAACAATAAA GGCATGCAAA 960 
CCCAGTGACC AGATCCTGAA GCTGCTCAGT TTGTGGCGAA TAAAAAATGG CGACCAAGAC 1020 
ACCTTGAAGG GCCTAATGCA CGCACTAAAG CACTCAAAGA CGTACCACTT TCCCAAAACT 1080 
GTCACTCAGA GTCTAAAGAA GACCATCAGG TTCCTTCACA GCTTCACAAT GTACAAATTG 1140 
TATCAGAAGT TATTTTTAGA AATGATAGGT AACCAGGTCC AATCAGTAAA AATAAGCTGC 1200 
TTATAA 1206 
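
The correspondence between the cDNA of SEQ ID NO: 4 and the amino acid sequence of SEQ ID NO: 3 can be checked by translating the nucleotide sequence with the standard genetic code. The following is a minimal sketch only, not part of the disclosed process: the names `CODON_TABLE` and `translate` are illustrative assumptions, and only the first and last lines of SEQ ID NO: 4 are used as input.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (third base varies fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter amino acids."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First and last lines of SEQ ID NO: 4 as printed in the listing.
FIRST_LINE = "ATGAACAACT TGCTGTGCTG CGCGCTCGTG TTTCTGGACA TCTCCATTAA GTGGACCACC"
LAST_LINE = "TTATAA"

print(translate(FIRST_LINE))  # one-letter codes for residues -21 to -2
print(translate(LAST_LINE))   # final Leu followed by the stop codon
```

The 1206 base pairs correspond to 402 codons, that is, the 401 amino acids of SEQ ID NO: 3 followed by one stop codon.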



Claims 

1. A DNA comprising the nucleotide sequences of Sequence ID No. 1 and No. 2 in the Sequence Table. 

2. The DNA according to claim 1, wherein the Sequence ID No. 1 includes the first exon of the OCIF gene and the 
Sequence ID No. 2 includes the second, third, fourth, and fifth exons. 

3. A protein exhibiting the activity of inhibiting differentiation and/or maturation of osteoclasts and having the following 
physicochemical characteristics: 

(a) molecular weight (SDS-PAGE): 

(i) Under reducing conditions: about 60 kD. 

(ii) Under non-reducing conditions: about 60 kD and about 120 kD; 

(b) amino acid sequence: 

includes an amino acid sequence of the Sequence ID No. 3 in the Sequence Table, 

(c) affinity: 

exhibits affinity to a cation exchanger and heparin, and 

(d) heat stability: 

(i) the osteoclastogenesis-inhibitory activity is reduced when treated with heat at 70°C for 10 minutes or at 
56°C for 30 minutes. 

(ii) the osteoclastogenesis-inhibitory activity is lost when treated with heat at 90°C for 10 minutes. 

4. A process for producing a protein exhibiting an activity of inhibiting differentiation and/or maturation of osteoclasts 
and having the following physicochemical characteristics: 

(a) molecular weight (SDS-PAGE): 

(i) Under reducing conditions: about 60 kD. 

(ii) Under non-reducing conditions: about 60 kD and about 120 kD; 

(b) amino acid sequence: 

includes an amino acid sequence of the Sequence ID No. 3 of the Sequence Table, 

(c) affinity: 

exhibits affinity to a cation exchanger and heparin, and 

(d) heat stability: 

(i) the osteoclastogenesis-inhibitory activity is reduced when treated with heat at 70°C for 10 minutes or at 
56°C for 30 minutes. 

(ii) the osteoclastogenesis-inhibitory activity is lost when treated with heat at 90°C for 10 minutes, 

the process comprising inserting a DNA including the nucleotide sequences of Sequence ID No. 1 and No. 2 in 
the Sequence Table into an expression vector, producing a vector capable of expressing a protein having the 
above-mentioned physicochemical characteristics and exhibiting the activity of inhibiting differentiation and/or 
maturation of osteoclasts, and producing this protein by a genetic engineering technique. 

Figure 1 